[AIRFLOW-4364] Add Pylint to CI #5238

Merged
merged 1 commit into from
May 30, 2019

Conversation

BasPH
Contributor

@BasPH BasPH commented May 5, 2019

This PR adds Pylint to the CI. Initially all files are blacklisted; the idea is to remove files from the blacklist one by one and make each of them Pylint-compatible. The default Pylint configuration is quite strict, so I ran Pylint with default settings, collected all messages, went over every single message type, and decided whether it should be kept, changed, or dropped.

The list of all messages with default Pylint configuration on Airflow master:

Error Total Action
C0111 (missing-docstring) 4770 Valid error, no config change
C0103 (invalid-name) 3030 - Constants (1077): Currently all different styles are found in the codebase. Set to any.
- Variables (1643): See all different styles, everything seems valid to me, kept at snake_case.
- Functions (35): Not too many errors, kept at snake_case.
- Arguments (146): Kept snake_case but allowed for 1 char argument names
- Methods (36): Kept on snake_case.
- Classes (16): Kept on PascalCase.
C0330 (bad-continuation) 793 Invalid error, ignored
E0401 (import-error) 712 Ignored to keep Pylint CI task fast. Tests with devel env installed should fail anyways on missing dependencies.
C0301 (line-too-long) 690 Set line length to 110
W0212 (protected-access) 525 Valid error, no change
R0801 (Similar lines in X files) 404 invalid, ignored
R0913 (too-many-arguments) 360 Default at 5, set to 80% of the error values -> 10
C0411 (wrong-import-order) 330 Valid error, no change. More discussion here: #4892 (comment)
W0223 (abstract-method) 320 Don't think it's necessary, ignored
W1113 (keyword-arg-before-vararg) 276 Unnecessary, ignored
R0201 (no-self-use) 271 Very common, not really a problem, ignored
W0613 (unused-argument) 268 Valid error, no change
R0401 (cyclic-import) 148 Valid error, no change
E1101 (no-member) 141 Valid error, no change
R1705 (no-else-return) 137 Don't consider this a problem. Ignored, can always undo in the future.
R0914 (too-many-locals) 132 Default on 15, set to 80% of the error values -> 24
R0902 (too-many-instance-attributes) 122 Default on 7, set to 80% of the error values -> 15
W0703 (broad-except) 107 Valid error, no change
W0201 (attribute-defined-outside-init) 107 Seen some cases in Airflow where this was indeed confusing, keep config
W0104 (pointless-statement) 91 Mostly thrown on the bitshift operation in example DAGs (72x), but also saw some valid errors (19x). There has been a discussion for years about ignoring specific messages for specific directories; more info: pylint-dev/pylint#2584. For now, I'll explicitly disable pointless-statement in all example DAGs and keep the message.
R0903 (too-few-public-methods) 88 Not really an issue, ignored
W1505 (deprecated-method) 79 Valid, keeping config
E1120 (no-value-for-parameter) 74 Raised with @provide_session and some other (valid) cases. Would like to ignore in case of @provide_session, but currently not possible. Asked on Pylint Github: pylint-dev/pylint#2894. Keeping config for now.
C1801 (len-as-condition) 73 Valid, keep config
W0612 (unused-variable) 65 Ignored args and kwargs; keep checking other variables
R0205 (useless-object-inheritance) 65 Should all be resolved after https://issues.apache.org/jira/browse/AIRFLOW-4206, keep config
W0231 (super-init-not-called) 60 I believe this is a valid error, but there are simply too many hooks subclassing from BaseHook and not calling super() to keep it. Ignoring for now and hopefully cleanup later.
R1720 (no-else-raise) 57 Not really a problem, ignored
C0412 (ungrouped-imports) 55 Valid, keep config
W0621 (redefined-outer-name) 54 Valid, keep config
W0511 (fixme) 53 There should be a good reason for a TODO, but I don't see a problem adding one. Ignore.
W0622 (redefined-builtin) 52 Valid, keep config
W0221 (arguments-differ) 45 Doesn't always seem valid, ignored
R0912 (too-many-branches) 43 Default 12, set to 80% of error values -> 22
R1710 (inconsistent-return-statements) 39 Valid, keep config
W0107 (unnecessary-pass) 36 In case of empty function, docstring acts as statement and pass is superfluous, keep config
E1123 (unexpected-keyword-arg) 36 Valid, keep config
R0915 (too-many-statements) 35 Default at 50, set to 80% of the error values -> 69
C0413 (wrong-import-position) 31 Valid, keep config
R0904 (too-many-public-methods) 30 Default at 20, set to 50% of the error values -> 27
W1202 (logging-format-interpolation) 23 Valid, keep config
W0611 (unused-import) 22 Valid, keep config
W0603 (global-statement) 20 Valid, keep config
E1111 (assignment-from-no-return) 20 Valid, keep config
W0106 (expression-not-assigned) 18 Sometimes valid, sometimes not. Ignore manually in code
C0123 (unidiomatic-typecheck) 18 Valid, keep config
W0222 (signature-differs) 17 Valid, keep config
W0235 (useless-super-delegation) 16 Valid, keep config
C0302 (too-many-lines) 16 Good indication the file should be split, keep config
C0122 (misplaced-comparison-constant) 16 Valid, keep config
C0325 (superfluous-parens) 15 Valid, keep config
W0715 (raising-format-tuple) 13 Valid, keep config
E0611 (no-name-in-module) 11 Valid, keep config
W0404 (reimported) 10 Valid, keep config
C0121 (singleton-comparison) 10 Valid, keep config
E0213 (no-self-argument) 9 Valid, keep config
R1714 (consider-using-in) 8 Valid, keep config
W1201 (logging-not-lazy) 7 Valid, keep config
R1718 (consider-using-set-comprehension) 7 Valid, keep config
R0911 (too-many-return-statements) 7 Indication there might be a nicer way to solve returns, keep config
R1719 (simplifiable-if-expression) 6 Valid, keep config
W0105 (pointless-string-statement) 5 Valid, keep config
R1704 (redefined-argument-from-local) 5 Valid, keep config
E1305 (too-many-format-args) 5 Pylint fails on multiline "".format(), ignore.
E0602 (undefined-variable) 5 Seems valid, keep config and ignore inline
W1308 (duplicate-string-formatting-argument) 4 Valid, keep config
W0102 (dangerous-default-value) 4 Valid, keep config
E1102 (not-callable) 4 Valid, keep config
W1509 (subprocess-popen-preexec-fn) 3 Valid, keep config
W0640 (cell-var-from-loop) 3 Code seems valid, ignore error
W0631 (undefined-loop-variable) 3 Valid, keep config
R1711 (useless-return) 3 Valid, keep config
R1707 (trailing-comma-tuple) 3 Valid, keep config
R1703 (simplifiable-if-statement) 3 Valid, keep config
C0200 (consider-using-enumerate) 3 Valid, keep config
W0706 (try-except-raise) 2 Valid, keep config
W0602 (global-variable-not-assigned) 2 Valid, keep config
W0150 (lost-exception) 2 Valid, keep config
R1701 (consider-merging-isinstance) 2 Valid, keep config
E1205 (logging-too-many-args) 2 Valid, keep config
E1136 (unsubscriptable-object) 2 Valid, keep config
E1133 (not-an-iterable) 2 Valid, keep config
E1121 (too-many-function-args) 2 Valid, keep config
E0211 (no-method-argument) 2 Valid, keep config
E0203 (access-member-before-definition) 2 Valid, keep config
C0303 (trailing-whitespace) 2 Valid, keep config
C0201 (consider-iterating-dictionary) 2 Valid, keep config
W1503 (redundant-unittest-assert) 1 Valid, keep config
W0702 (bare-except) 1 Valid, keep config
W0604 (global-at-module-level) 1 Valid, keep config
W0601 (global-variable-undefined) 1 Valid, keep config
W0402 (deprecated-module) 1 Valid, keep config
W0401 (wildcard-import) 1 Valid, keep config
W0233 (non-parent-init-called) 1 Valid, keep config
W0125 (using-constant-test) 1 Valid, keep config
R1717 (consider-using-dict-comprehension) 1 Valid, keep config
R1715 (consider-using-get) 1 Valid, keep config
R1702 (too-many-nested-blocks) 1 5 seems sensible default, keep config
R0916 (too-many-boolean-expressions) 1 5 seems sensible default, keep config
E0303 (invalid-length-returned) 1 Valid, keep config
E0202 (method-hidden) 1 Valid, keep config
C0113 (unneeded-not) 1 Valid, keep config
C0102 (blacklisted-name) 1 Removed foo, bar & baz from the blacklisted names. Useful for example DAGs.
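
Several thresholds in the table are described as "set to 80% of the error values". One possible reading, and this is only an assumption about how the numbers were derived, is a nearest-rank 80th percentile over the values Pylint reported for each violation. A minimal sketch of that reading, where `percentile_threshold` and the sample counts are hypothetical and not taken from the PR:

```python
import math


def percentile_threshold(values, fraction=0.8):
    """Return the smallest reported value that covers `fraction` of all violations.

    `values` holds the measured quantity for each violation, e.g. the number
    of arguments of every function flagged by too-many-arguments.
    """
    if not values:
        raise ValueError("need at least one reported value")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(values)
    # Nearest-rank percentile: 1-based rank of the value covering the fraction.
    rank = math.ceil(fraction * len(ordered))
    return ordered[rank - 1]


# Hypothetical argument counts for ten flagged functions (not real Airflow data).
sample_argument_counts = [6, 6, 7, 7, 8, 9, 10, 10, 12, 15]
suggested_max_args = percentile_threshold(sample_argument_counts)
```

With a limit chosen this way, roughly 80% of the currently flagged cases pass and only the worst outliers still need refactoring.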

Make sure you have checked all steps below.

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-XXX] My Airflow PR"
    • https://issues.apache.org/jira/browse/AIRFLOW-4364
    • In case you are fixing a typo in the documentation you can prepend your commit with [AIRFLOW-XXX], code changes always need a Jira issue.
    • In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal (AIP).
    • In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Check Pylint on changed lines.

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

Code Quality

  • Passes flake8

@BasPH BasPH changed the title [AIRFLOW-4364] Add Pylint to CI [WIP][AIRFLOW-4364] Add Pylint to CI May 5, 2019
@potiuk potiuk self-requested a review May 5, 2019 17:10
@BasPH BasPH changed the title [WIP][AIRFLOW-4364] Add Pylint to CI [AIRFLOW-4364] Add Pylint to CI May 5, 2019
@potiuk
Member

potiuk commented May 5, 2019

Hey @BasPH - I will take a closer look later today as I am very interested :) (travelling today) but I have one question/idea which we might implement with this change. We can defer it for the future as well, but I think maybe it's worth starting now.

Since we are starting to use more linters, I thought maybe we could already switch to using the pre-commit hook framework for those linters.

https://pre-commit.com/

It allows running checks on CI but, more importantly, it can run the very same checks as pre-commit hooks. It is really nicely implemented: it has a nice UI, allows adding many ready-to-use linters and checkers (and some automated code modifications, like adding licence headers), and is super easy for a developer to install locally. It also has a pluggable interface and (I believe) can already filter to only changed files (though not changed lines by default).

As a local pre-commit check, it could be run on all locally modified files, so that people are encouraged to fix errors faster. And on Travis we could continue checking only modified lines, for example.

I think you could fairly easily turn your python script into a pre-commit plugin rather than have a standalone script and then we could benefit from being able to run the checks with pre-commit hooks (which is far better than waiting for Travis).

I discovered it recently and applied it successfully to the Oozie2Airflow converter we work on, where we applied some 20+ checks. You can see an example here:
https://travis-ci.org/GoogleCloudPlatform/cloud-composer/builds/528367055#L1638

And here is the list of checks we have implemented in our project:

Formats python files using black...................................................Passed
Add licence for all XML, md files..................................................Passed
Add licence for all .pig files.....................................................Passed
Add licence for all python/yaml/property files.....................................Passed
Add licence for all Jinja templates................................................Passed
No-tabs checker....................................................................Passed
Flake8.............................................................................Passed
Check that executables have shebangs...............................................Passed
Check for merge conflicts..........................................................Passed
Check Xml..........................................................................Passed
Check Yaml.........................................................................Passed
Debug Statements (Python)..........................................................Passed
Detect Private Key.................................................................Passed
Fix python encoding pragma.........................................................Passed
Fix End of Files...................................................................Passed
Mixed line ending..................................................................Passed
Fix requirements.txt...............................................................Passed
Trim Trailing Whitespace...........................................................Passed
Check hooks apply to the repository................................................Passed
Check for useless excludes.........................................................Passed
Checks typing annotations consistency with mypy....................................Passed
Checks for common programming errors with pylint...................................Passed
Runs all unit tests with pytest....................................................Passed
Check Shell scripts syntax corectness..............................................Passed
Detect unicode non-breaking space character U+00A0 aka M-BM-.......................Passed
Remove unicode non-breaking space character U+00A0 aka M-BM-.......................Passed
Detect the EXTREMELY confusing unicode character U+2013............................Passed
Remove the EXTREMELY confusing unicode character U+2013............................Passed
Validates all oozie workflows......................................................Passed
Checks for security vulnerabilities in dependencies................................Passed

@codecov-io

codecov-io commented May 5, 2019

Codecov Report

Merging #5238 into master will increase coverage by <.01%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5238      +/-   ##
==========================================
+ Coverage   78.94%   78.94%   +<.01%     
==========================================
  Files         480      480              
  Lines       30153    30153              
==========================================
+ Hits        23803    23804       +1     
+ Misses       6350     6349       -1
Impacted Files Coverage Δ
...ow/contrib/example_dags/example_gcp_compute_igm.py 0% <ø> (ø) ⬆️
...irflow/contrib/example_dags/example_gcp_spanner.py 0% <ø> (ø) ⬆️
...ntrib/example_dags/example_gcp_natural_language.py 0% <ø> (ø) ⬆️
...rflow/contrib/example_dags/example_gcp_transfer.py 0% <ø> (ø) ⬆️
...irflow/contrib/example_dags/example_gcp_compute.py 0% <ø> (ø) ⬆️
airflow/contrib/example_dags/example_gcp_vision.py 0% <ø> (ø) ⬆️
airflow/contrib/example_dags/example_gcs_acl.py 0% <ø> (ø) ⬆️
...flow/contrib/example_dags/example_gcp_translate.py 0% <0%> (ø) ⬆️
...rib/example_dags/example_gcp_bigtable_operators.py 0% <0%> (ø) ⬆️
...flow/contrib/example_dags/example_gcp_sql_query.py 0% <0%> (ø) ⬆️
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6f684d0...05854d4. Read the comment docs.

Member

@potiuk potiuk left a comment

Just a general comment. Maybe we should really not care about changed lines? I think pylint will be introduced way faster if we limit it per file rather than per changed line. It's a bit "harsh" on the individual contributors on one hand, but it will be much better for the community.

cell-var-from-loop, # Raises spurious errors
super-init-not-called, # BasPH: ignored for now but should be fixed somewhere in the future
arguments-differ, # Doesn't always raise valid messages
import-error, # Requires installing Airflow environment in CI task which takes long, therefore ignored. Tests should fail anyways if deps are missing. Possibly un-ignore in the future if we ever use pre-built Docker images for CI.
Member

We will hopefully soon - as most of the stability problems with tests are fixed now, I am resuming working on this :)

scripts/ci/ci_pylint.py Outdated
@@ -0,0 +1,104 @@
# Licensed to the Apache Software Foundation (ASF) under one or more
Member

Should we add shebang here?

#!/usr/bin/env python3
And make the script executable. That would make it easier to run standalone and would indicate Python 3 compatibility.


# Get Python files to run Pylint against.
# Git command from https://github.com/sk-/git-lint/blob/master/gitlint/git.py
git_modified_files_cmd = "git status --porcelain --untracked-files=all --ignore-submodules=all".split()
Member

I think this command will return no files in CI, but I understand the purpose is to run the checks on all locally modified files only. Maybe we should mention somewhere in the readme that the script can be run locally, not only in the CI environment (and how).

Contributor Author

Yeah, I borrowed this method from git-lint. The shell script calls git reset --soft on the repo so that the changes show up as locally modified, and this Python script can then pick them up. This script can be used both locally and in Travis, by starting the ci_pylint.sh script.

I will add documentation and more comments.
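
To illustrate what picking up locally modified files from `git status --porcelain` could look like, here is a minimal sketch. It is not the code from `ci_pylint.py`; the function name `changed_python_files` is hypothetical, and the sketch assumes the short porcelain format (two status characters, a space, then the path) and ignores paths that git quotes because of special characters.

```python
def changed_python_files(porcelain_output):
    """Extract lintable .py paths from `git status --porcelain` output."""
    files = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue  # too short to hold "XY path"
        status, path = line[:2], line[3:]
        if "D" in status:
            continue  # deleted files cannot be linted
        if " -> " in path:
            # Renames are reported as "old -> new"; lint the new path.
            path = path.split(" -> ", 1)[1]
        if path.endswith(".py"):
            files.append(path)
    return files


# Hand-written sample output, not captured from a real repository.
sample_status = (
    " M airflow/models.py\n"
    "?? scripts/ci/ci_pylint.py\n"
    " D old_module.py\n"
    "R  a.py -> b.py\n"
    " M README.md\n"
)
python_files = changed_python_files(sample_status)
```

In a real run the input would come from something like `subprocess.check_output(git_modified_files_cmd).decode()`.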

for filename in py_files_to_check:
    git_blame_cmd = "git blame --porcelain {0}".format(filename).split()
    try:
        with open(os.devnull, "w") as devnull:
Member

since we are using python 3.5+ maybe this will be better:
https://docs.python.org/3/library/contextlib.html#contextlib.redirect_stderr
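
One caveat worth keeping in mind: `contextlib.redirect_stderr` only replaces `sys.stderr` for writes made by Python code in the current process. It does not affect the stderr of a child process. Assuming the `devnull` handle in the snippet above is passed to a `subprocess` call (the excerpt does not show that part), `subprocess.DEVNULL` may be the simpler option. A small sketch of both, with the hypothetical helper `run_quietly`:

```python
import contextlib
import io
import subprocess
import sys


def run_quietly(cmd):
    """Run cmd, discard its stderr, and return stdout as text (None on failure)."""
    try:
        return subprocess.check_output(cmd, stderr=subprocess.DEVNULL).decode()
    except subprocess.CalledProcessError:
        return None


# redirect_stderr captures Python-level writes to sys.stderr only.
buffer = io.StringIO()
with contextlib.redirect_stderr(buffer):
    print("python-level warning", file=sys.stderr)
captured = buffer.getvalue()
```

For the git blame case this would look like `run_quietly(git_blame_cmd)`, with `None` signalling that the command failed.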

filename_lines[filename] = None
continue

# We only check lines starting with 40 zeros. Therefore run this script only from ci_pylint.sh
Member

I had to figure out that 40 zeros means "not-yet-committed". Maybe we should mention that this is the intention

Contributor Author

Good point. Will do.
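
For readers unfamiliar with the convention: in `git blame --porcelain` output, each line group starts with a header of the form `<40-hex sha> <original line> <final line> [<group size>]`, and lines that are not yet committed are attributed to a SHA of 40 zeros. A minimal sketch of collecting those line numbers follows; it is not the code from this PR, and `uncommitted_lines` is a hypothetical name.

```python
import re

UNCOMMITTED_SHA = "0" * 40  # git blame uses an all-zero SHA for not-yet-committed lines
# Header rows: "<sha> <orig line> <final line>" plus an optional group size.
_HEADER = re.compile(r"^([0-9a-f]{40}) (\d+) (\d+)(?: \d+)?$")


def uncommitted_lines(blame_porcelain):
    """Return the final line numbers that git blame marks as not yet committed."""
    lines = set()
    for row in blame_porcelain.split("\n"):
        match = _HEADER.match(row)
        # Content rows start with a TAB, so they never match the header pattern.
        if match and match.group(1) == UNCOMMITTED_SHA:
            lines.add(int(match.group(3)))
    return lines


# Hand-written sample in porcelain layout, not real blame output.
sample_blame = "\n".join([
    UNCOMMITTED_SHA + " 1 1 2",
    "author Not Committed Yet",
    "\tfirst new line",
    UNCOMMITTED_SHA + " 2 2",
    "\tsecond new line",
    "a" * 40 + " 3 3 1",
    "author Someone",
    "\talready committed line",
])
new_lines = uncommitted_lines(sample_blame)
```

Pylint messages could then be filtered to those whose line number is in the returned set.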

git cat-file -e ${FIRST_COMMIT} || git reset --soft $TRAVIS_COMMIT~1

python $(dirname ${BASH_SOURCE[0]})/ci_pylint.py
else
Member

Shouldn't we reset all back to the original state here? I think it is nice for any other potential scripts running after this one.

FIRST_COMMIT=${TRAVIS_COMMIT_RANGE:0:12}

# First commit in range exists. This should pick up regular commits.
git cat-file -e ${FIRST_COMMIT} && git reset --soft ${TRAVIS_COMMIT_RANGE%...*}
Member

Just a question - I could not find it quickly but maybe we should get somehow the base branch for the PR and make the check for all changes? From what I see now, we will only run pylint check for the last pushed commit if we add a new one in the existing PR, but maybe we should run it for all the commits in PR ? It should not change much but maybe it can be simplified a bit in this case (no distinction of normal/force-push commits).

@BasPH
Contributor Author

BasPH commented May 5, 2019

@potiuk The script was a bit tricky to get right. Will comment on all individual comments after this.

About the path to complete Pylint compatibility: I'd love to have Airflow completely Pylint compatible asap. I think checking only changed lines is the fastest way to have some form of Pylint in Airflow, and then we (the community) can work on making all code compatible in the meantime.

Since checking only changed lines is a bit messy and requires these extra scripts, another option is to first make everything compatible, touch 99% of the codebase in a single PR (or split up PRs per module/package) and then enable Pylint in the CI afterwards.

What do you think?

@potiuk
Member

potiuk commented May 5, 2019

I agree the per-line script is pretty messy. I'd be all for making a big pylint update (but then a lot of PRs will have merge conflicts). And we will have a lot of problems with merging stuff to v1-10* branches. But maybe it's worth it :). As long as we provide some tools for all the contributors to fix/test their changes locally (like pre-commit hooks), maybe it's the best way to implement it.

It's a bit harsh, but maybe we can run this through community and ask them for feedback what they think about it ? I think personally it's the best way of introducing such changes - rather than fix everything, give all the contributors the tools and support in case of questions (like be ready to quickly answer questions - whether we should disable a rule or fix this particular case) and let them convert their PRs in a distributed fashion on their own.

Member

@feluelle feluelle left a comment

@BasPH Thank you for the addition of Pylint to make our code a better place :P

I added a few comments; maybe you can take a look and state your opinion. I am aware that we will have a lot to refactor (even more when accepting my suggestions) to have a fully linted code base, but I would be glad to help out on that :)

.pylintrc Outdated
#function-rgx=

# Good variable names which should always be accepted, separated by a comma.
good-names=i,
Member

I personally don't like these exceptions of good names. A good name for me is something that well describes its purpose.

The only name I would leave there is the _ because it actually has a meaning - ignore that value

Member

I think those values are quite common for a number of simple loops (i, j, k), 'ex' being universally understood for Exception and '_' of course. I'd remove Run though ;)

Contributor Author

Good points, for me the use of i in this example is pretty clear and used by many Python developers:

for i in range(100):
    ....

So I suggest to leave that in.

I removed Run, not sure how that's ever a good variable name.

.pylintrc Outdated (resolved)
.pylintrc (resolved)
.pylintrc Outdated (resolved)
.pylintrc Outdated (resolved)
airflow/contrib/example_dags/example_dingding_operator.py Outdated (resolved)
@BasPH
Contributor Author

BasPH commented May 25, 2019

@potiuk Another option might be to blacklist all files, enforce Pylint in the CI (on full files - not changed lines), and one-by-one, or directory-by-directory, make all files compatible and remove from the blacklist, until everything is done.

@potiuk
Member

potiuk commented May 25, 2019

@BasPH - I like that much more. Such a blacklist would be our TO-DO list, and we could have a separate JIRA issue against which we could report any commits that fix Pylint errors. We could then make a similar effort as we do with dropping Python 2. I am happy to help with that :)

@BasPH
Contributor Author

BasPH commented May 25, 2019

@potiuk Cool, that would work + no more hacky script for figuring out changed lines :) What do you think of the following:

I have this command to run Pylint:

find . -name "*.py" -not -path "./.eggs/*" -not -path "./airflow/www/node_modules/*" -not -path "./airflow/_vendor/*" | \
grep -v 'airflow/[a-zA-Z_]*\.py' | \
grep -v 'tests/[a-zA-Z_]*\.py' | \
grep -v airflow/api | \
grep -v airflow/bin | \
grep -v airflow/config_templates | \
grep -v airflow/contrib/auth | \
grep -v airflow/contrib/hooks | \
grep -v airflow/contrib/operators | \
grep -v airflow/contrib/plugins | \
grep -v airflow/contrib/sensors | \
grep -v airflow/contrib/task_runner | \
grep -v airflow/contrib/utils | \
grep -v airflow/dag | \
grep -v airflow/example_dags | \
grep -v airflow/executors | \
grep -v airflow/hooks | \
grep -v airflow/jobs | \
grep -v airflow/kubernetes | \
grep -v airflow/lineage | \
grep -v airflow/macros | \
grep -v airflow/migrations | \
grep -v airflow/models | \
grep -v airflow/operators | \
grep -v airflow/security | \
grep -v airflow/sensors | \
grep -v airflow/task | \
grep -v airflow/ti_deps | \
grep -v airflow/utils | \
grep -v airflow/www | \
grep -v dags | \
grep -v docs | \
grep -v scripts | \
grep -v setup.py | \
grep -v tests/api | \
grep -v tests/cli | \
grep -v tests/contrib/hooks | \
grep -v tests/contrib/operators | \
grep -v tests/contrib/sensors | \
grep -v tests/contrib/task_runner | \
grep -v tests/contrib/utils | \
grep -v tests/executors | \
grep -v tests/hooks | \
grep -v tests/jobs | \
grep -v tests/kubernetes | \
grep -v tests/lineage | \
grep -v tests/macros | \
grep -v tests/migrations | \
grep -v tests/minikube | \
grep -v tests/models | \
grep -v tests/operators | \
grep -v tests/plugins | \
grep -v tests/security | \
grep -v tests/sensors | \
grep -v tests/task | \
grep -v tests/test_utils | \
grep -v tests/ti_deps | \
grep -v tests/utils | \
grep -v tests/www | \
xargs pylint --output-format=colorized

For every grep line (58 total), I create an issue. In the issue, the person will remove the particular grep line to pass more files to Pylint, and fix all Pylint issues that then occur. Once all grep lines are finally removed, the command will be:

find . -name "*.py" -not -path "./.eggs/*" -not -path "./airflow/www/node_modules/*" -not -path "./airflow/_vendor/*" | xargs pylint --output-format=colorized

If you agree, I will change this PR by removing the changed-line-checking scripting and running this command in the CI pipeline.
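
The same blacklist idea can be sketched in Python. This is only an illustration under assumptions, not the script that was merged: `files_to_lint` and the sample paths are hypothetical. Note one deliberate difference from the `grep -v` chain: grep excludes any path containing the pattern as a substring (so `grep -v dags` also drops every path with "dags" anywhere in it), whereas this sketch excludes only paths under the listed directory prefixes.

```python
import os


def files_to_lint(all_files, excluded_prefixes):
    """Return the files that are not under any blacklisted path prefix."""
    excluded = [os.path.normpath(prefix) for prefix in excluded_prefixes]
    keep = []
    for filename in all_files:
        normalized = os.path.normpath(filename)  # "./airflow/x.py" -> "airflow/x.py"
        blacklisted = any(
            normalized == prefix or normalized.startswith(prefix + os.sep)
            for prefix in excluded
        )
        if not blacklisted:
            keep.append(filename)
    return keep


# Hypothetical inputs: a few paths and two entries from the to-do list.
candidates = [
    "./airflow/api/client.py",
    "./airflow/settings.py",
    "./tests/utils/test_dates.py",
]
todo = ["airflow/api", "tests/utils"]
lintable = files_to_lint(candidates, todo)
```

Removing an entry from the to-do list then hands the files under that prefix to Pylint, which matches the one-issue-per-entry workflow described above.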

@BasPH
Contributor Author

BasPH commented May 30, 2019

@potiuk I've made a list (scripts/ci/pylint_todo.txt) of all the files to exclude from linting. All is green on my machine + own Travis. If you accept and merge once Travis is done, we can create issues from the list above and have everybody make Airflow Pylint compatible 🙂

@potiuk potiuk merged commit 669b026 into apache:master May 30, 2019
@potiuk
Member

potiuk commented May 30, 2019

All good @BasPH. I will be travelling to the Bay Area Meetup tomorrow and will be pretty busy with meetings in the Bay Area next week, so I won't be able to do a lot over the next week or so, but I will do my best to help with it.

@pgagnon
Contributor

pgagnon commented May 31, 2019

I wish this had been introduced as a non-failing check at least at the beginning. Pylint is a notoriously prickly linter...

@BasPH
Contributor Author

BasPH commented Jun 2, 2019

Hi @pgagnon could you clarify? All files are blacklisted so it should succeed right now. Only (obviously) when you remove files from the blacklist, will you see failures. But that's also the goal, to make everything Pylint-compatible and eventually have much more consistent and readable code.

@potiuk
Member

potiuk commented Jun 2, 2019

Yeah. Also, it should be possible to disable some rules in parts of a file in case we find the code non-compliant and there is a good reason for it. It's one of the valid ways of dealing with such issues.

Also that made me think @BasPH -> I thought maybe we should also have a very short explanation in CONTRIBUTING.md on the Pylint process we are going to follow - once we create JIRA issues. We could link to the JIRA issue and some links to pylint docs explaining how to deal with checks in case there is no easy fix (mainly #pylint disable/enable comments), because not everyone realises that.
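
As an illustration of those comments, Pylint accepts `# pylint: disable=<message-name>` either at the end of a line or on its own line for the rest of the enclosing scope, with `# pylint: enable=<message-name>` switching the message back on. The sketch below uses a hypothetical stand-in `Task` class rather than real Airflow operators, and picks `pointless-statement` because that is the message the PR description reports for the bitshift syntax.

```python
class Task:
    """Stand-in for an Airflow operator; only mimics the >> dependency syntax."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.downstream = []

    def __rshift__(self, other):
        self.downstream.append(other)
        return other


extract = Task("extract")
transform = Task("transform")
load = Task("load")

# Inline form: silences the message for this line only.
extract >> transform  # pylint: disable=pointless-statement

# Block form: silences the message until it is enabled again.
# pylint: disable=pointless-statement
transform >> load
# pylint: enable=pointless-statement
```

Keeping the disable as narrow as possible means the message still catches genuine mistakes elsewhere in the file.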

@pgagnon
Contributor

pgagnon commented Jun 2, 2019

@BasPH I didn't realize at first that a blacklist mechanism was in place. Nevertheless I feel that we should disable missing-docstring for now until (1) we are able to have more granular control over it, and (2) have some kind of docstring format standard in place for the project.

andriisoldatenko pushed a commit to andriisoldatenko/airflow that referenced this pull request Jul 26, 2019
wmorris75 pushed a commit to modmed/incubator-airflow that referenced this pull request Jul 29, 2019
5 participants