Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add log streaming to papermill plugin #1129

Merged
merged 33 commits into from
Apr 19, 2023
Merged
Changes from all commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
cf16ea6
checkpoint
Mar 31, 2022
eff0a7d
Experimental implementation works well. Instead of messing with the c…
Mar 31, 2022
afa9c5f
fix spacing
Mar 31, 2022
369eca0
remove breakpoint
Mar 31, 2022
a29367e
Output ctx.working_directory as single output
Mar 31, 2022
a64774b
more doc strings
Mar 31, 2022
7829281
more comments
Mar 31, 2022
15453aa
Added comments
Apr 1, 2022
0abcfdc
Add tests and test files from other branch
Apr 1, 2022
e0efff6
minor fix, have function return the instance rather than create. It s…
Apr 6, 2022
4abd447
fix tests
Apr 11, 2022
6b82738
Merge branch 'flyteorg:master' into master
myz540 Apr 11, 2022
6e2964b
remove set flags not supported by sh
Apr 12, 2022
d16b54e
fix spellcheck lint errors
Apr 12, 2022
9a989de
don't have a windows equivalent test script so bypassing those tests …
Apr 15, 2022
e217806
Add typing to make_export_string_from_env_dict params
Apr 18, 2022
2bce64e
fixup doc string
Apr 18, 2022
90a6b17
Merge branch 'flyteorg:master' into master
myz540 Apr 25, 2022
801c852
Merge branch 'flyteorg:master' into master
myz540 Apr 27, 2022
3d26075
Refactored the new behavior out into a new class. Did not change impl…
Apr 27, 2022
ce4603f
Address linter issues and up test cov
May 2, 2022
ead6495
Skip test on windows, no equivalent script
May 2, 2022
2ebeecd
Run black and isort, address SC2236
May 3, 2022
e4c121d
Refactored class name to remove _, make utility function require name…
May 3, 2022
215f48c
fix utility function
May 3, 2022
5b75691
fix utility function call in tests
May 3, 2022
53e88f6
Fix isort on test_shell.py
May 4, 2022
6c755ca
Merge branch 'flyteorg:master' into master
myz540 May 4, 2022
5ec62d6
Merge branch 'flyteorg:master' into master
CalvinLeather Jul 28, 2022
f1ed74c
Merge branch 'flyteorg:master' into master
CalvinLeather Aug 10, 2022
6d5e67a
adding logging settings to papermill plugin
Aug 10, 2022
8b85e13
set level to info logging
Aug 10, 2022
c854024
improved comments/docs
Aug 10, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
22 changes: 21 additions & 1 deletion plugins/flytekit-papermill/flytekitplugins/papermill/task.py
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
import json
import logging
import os
import sys
import typing
from typing import Any

Expand Down Expand Up @@ -84,6 +86,13 @@ class NotebookTask(PythonInstanceTask[T]):

Users can access these notebooks after execution of the task locally or from remote servers.

.. note:

By default, print statements in your notebook won't be transmitted to the pod logs/stdout. If you would
like to have logs forwarded as the notebook executes, pass the stream_logs argument. Note that notebook
logs can be quite verbose, so ensure you are prepared for any downstream log ingestion costs
(e.g., cloudwatch)

.. todo:

Implicit extraction of SparkConfiguration from the notebook is not supported.
Expand Down Expand Up @@ -112,6 +121,7 @@ def __init__(
name: str,
notebook_path: str,
render_deck: bool = False,
stream_logs: bool = False,
task_config: T = None,
inputs: typing.Optional[typing.Dict[str, typing.Type]] = None,
outputs: typing.Optional[typing.Dict[str, typing.Type]] = None,
Expand All @@ -132,6 +142,16 @@ def __init__(
self._notebook_path = os.path.abspath(notebook_path)

self._render_deck = render_deck
self._stream_logs = stream_logs

# Send the papermill logger to stdout so that it appears in pod logs. Note that papermill doesn't allow
# injecting a logger, so we cannot redirect logs to the flyte child loggers (e.g., the userspace logger)
# and inherit their settings, but we instead must send logs to stdout directly
if self._stream_logs:
papermill_logger = logging.getLogger("papermill")
papermill_logger.addHandler(logging.StreamHandler(sys.stdout))
# Papermill leaves the default level of DEBUG. We increase it here.
papermill_logger.setLevel(logging.INFO)

if not os.path.exists(self._notebook_path):
raise ValueError(f"Illegal notebook path passed in {self._notebook_path}")
Expand Down Expand Up @@ -207,7 +227,7 @@ def execute(self, **kwargs) -> Any:
"""
logger.info(f"Hijacking the call for task-type {self.task_type}, to call notebook.")
# Execute Notebook via Papermill.
pm.execute_notebook(self._notebook_path, self.output_notebook_path, parameters=kwargs) # type: ignore
pm.execute_notebook(self._notebook_path, self.output_notebook_path, parameters=kwargs, log_output=self._stream_logs) # type: ignore

outputs = self.extract_outputs(self.output_notebook_path)
self.render_nb_html(self.output_notebook_path, self.rendered_output_path)
Expand Down