Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-5390] Remove provide context #5990

Merged
merged 22 commits into from
Sep 10, 2019

Conversation

Fokko
Copy link
Contributor

@Fokko Fokko commented Sep 3, 2019

Make sure you have checked all steps below.

I'm giving Apache Airflow training across Europe, and at every workshop that I provide, I find it a bit awkward to introduce the idea of the provide_context. Instead, I want to remove this thing and make it infer the variables automagically based on the signature of the Python callable. Less is more in terms of the public API :-)

Jira

  • My PR addresses the following Airflow Jira issues and references them in the PR title. For example, "[AIRFLOW-5390] My Airflow PR"
    • https://issues.apache.org/jira/browse/AIRFLOW-5390
    • In case you are fixing a typo in the documentation you can prepend your commit with [AIRFLOW-XXX], code changes always need a Jira issue.
    • In case you are proposing a fundamental code change, you need to create an Airflow Improvement Proposal (AIP).
    • In case you are adding a dependency, check if the license complies with the ASF 3rd Party License Policy.

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

Code Quality

  • Passes flake8

airflow/operators/python_operator.py Outdated Show resolved Hide resolved
airflow/operators/python_operator.py Outdated Show resolved Hide resolved
Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting idea!

I'd like to see expanded tests on this, as I think this might break op_kwargs a bit. For example this would print 1 before, but now I think will print something else.

PythonOperator(
    op_kwargs={'dag': 1},
    python_callable: lambda dag: print(dag)
)

This needs a note in UPDATING.md too saying how it works.

I think if we leave (but ignore/warn about) provide_context we could make this mostly a non-breaking change which makes our users' lives easier when upgrading

@ashb
Copy link
Member

ashb commented Sep 3, 2019

Another example: what happens if I do:

def fn(dag, **context):
    print(dag)

PythonOperator(
    op_args=[1],
    python_callable=fn
)

@Fokko
Copy link
Contributor Author

Fokko commented Sep 3, 2019

Thanks @ashb for thinking along. Appreciate it. I've added the tests to the suite.

I think we can fix the one with the op_args by skipping the first len(op_args) number of arguments. This would not introduce any change in behavior. Similar to the arguments in the kwargs, we could give the keys in op_kwargs priority.

kwargs is actually handled: https://github.com/apache/airflow/blob/master/airflow/operators/python_operator.py#L108-L109

@Fokko Fokko closed this Sep 4, 2019
@Fokko
Copy link
Contributor Author

Fokko commented Sep 4, 2019

Everything is green now

Changed the behavior. This following snippet will throw an exception now, since dag is a reserved keyword:

def fn(dag, **context):
    print(dag)

PythonOperator(
    op_args=[1],
    python_callable=fn
)

Otherwise it might become very confusing when you change the op_args etc.

@Fokko Fokko reopened this Sep 4, 2019
@codecov-io
Copy link

codecov-io commented Sep 9, 2019

Codecov Report

Merging #5990 into master will decrease coverage by 0.1%.
The diff coverage is 94.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5990      +/-   ##
==========================================
- Coverage   80.02%   79.91%   -0.11%     
==========================================
  Files         594      594              
  Lines       34769    34794      +25     
==========================================
- Hits        27824    27807      -17     
- Misses       6945     6987      +42
Impacted Files Coverage Δ
...ow/contrib/example_dags/example_qubole_operator.py 0% <ø> (ø) ⬆️
airflow/example_dags/docker_copy_data.py 100% <ø> (ø) ⬆️
airflow/gcp/utils/mlengine_operator_utils.py 95.34% <ø> (ø) ⬆️
airflow/example_dags/example_python_operator.py 94.44% <ø> (ø) ⬆️
...ample_dags/example_branch_python_dop_operator_3.py 75% <ø> (ø) ⬆️
airflow/example_dags/example_trigger_target_dag.py 92.3% <ø> (ø) ⬆️
airflow/example_dags/example_xcom.py 62.5% <ø> (ø) ⬆️
airflow/operators/python_operator.py 95.75% <100%> (+0.33%) ⬆️
airflow/contrib/sensors/python_sensor.py 100% <100%> (+15%) ⬆️
airflow/sensors/http_sensor.py 96.66% <100%> (ø) ⬆️
... and 12 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d31c092...b592e68. Read the comment docs.

UPDATING.md Outdated Show resolved Hide resolved
airflow/operators/python_operator.py Outdated Show resolved Hide resolved
airflow/operators/python_operator.py Outdated Show resolved Hide resolved
@BasPH BasPH merged commit dd175fa into apache:master Sep 10, 2019
@mik-laj
Copy link
Member

mik-laj commented Sep 10, 2019

I am not sure if it has sufficient documentation. Could you check that the documentation in the docs/howto/operator/python.rst file is complete?

This is very well described in the UPDATING.md file, but it is not a good place to put valuable for first user documentation.

@kaxil
Copy link
Member

kaxil commented Sep 12, 2019

Awesome work @Fokko

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants