Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing Python linking error #149

Merged
merged 14 commits into from
Dec 4, 2022
Merged

Fixing Python linking error #149

merged 14 commits into from
Dec 4, 2022

Conversation

haixuanTao
Copy link
Collaborator

@haixuanTao haixuanTao commented Nov 21, 2022

This commit is an initial draft at fixing #147. The error is due to the fact that pyo3 has linked the libpython from the compilation and not trying to use libpython that is available in LD_LIBRARY_PATH.

The current only solution to disable the linking is to use the pyo3/extension-module flag. This requires to make the python runtime-node packaged in a python library. The python runtime-node should also be fully compatible with the other operators in case we want hybrid runtime.

The issue that i'm facing is that. Due to the packaging, I have to deal with the GIL that is present from the start of dora-runtime node. This however makes the process single threaded wich is impossible.

So, I have to disable it, but when I do, I have a race condition:

Exception ignored in: <module 'threading' from '/usr/lib/python3.8/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 1373, in _shutdown
    assert tlock.locked()
AssertionError:

The issue is the same as PyO3/pyo3#1274

To fix this issue, I'm going to look again at the different step to make this work.

But this is my current investigation.

Fixes #147

This commit is an initial draft at fixing #147. The error is due to the
fact that pyo3 has linked the libpython from the compilation and not
trying to use libpython that is available in `LD_LIBRARY_PATH`.

The current only solution to disable the linking is to use the `extension-module` flag.
This requires to make the python `runtime-node` packaged in a python library.
The python `runtime-node` should also be fully compatible with the other operators in case we want hybrid runtime.

The issue that i'm facing is that. Due to the packaging, I have to deal with the `GIL` that is present from the start of
`dora-runtime` node. This however makes the process single threaded wich is impossible.

So, I have to disable it, but when I do, I have a race condition:
```bash
Exception ignored in: <module 'threading' from '/usr/lib/python3.8/threading.py'>
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 1373, in _shutdown
    assert tlock.locked()
AssertionError:
```
The issue is the same as PyO3/pyo3#1274

To fix this issue, I'm going to look again at the different step to make this work.

But this is my current investigation.
By making the stopping loop the first loop, we can avoid using `pyo3/allow_threads`, which
seems buggy.
@haixuanTao haixuanTao changed the title DRAFT: Fixing Python linking error Fixing Python linking error Nov 29, 2022
examples/python-dataflow/dataflow_without_webcam.yml Outdated Show resolved Hide resolved
binaries/runtime/Cargo.toml Show resolved Hide resolved
binaries/runtime/src/operator/python.rs Show resolved Hide resolved
binaries/runtime/src/lib.rs Show resolved Hide resolved
binaries/runtime/src/lib.rs Outdated Show resolved Hide resolved
binaries/runtime/src/lib.rs Outdated Show resolved Hide resolved
@haixuanTao haixuanTao marked this pull request as ready for review December 1, 2022 20:44
Copy link
Collaborator

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates!

As I understand it, the runtime now only works with a single Python operator. If a Python operator is combined with other operators, Python or Rust, it might not spawn the additional operators (because it waits until the Python operator finishes first). Is that correct? If so, I think we should forbid spawning Python operators together with other operators for now and throw an error in that case.

Otherwise, this PR seems good to go.

@haixuanTao
Copy link
Collaborator Author

haixuanTao commented Dec 2, 2022

Thanks for the updates!

As I understand it, the runtime now only works with a single Python operator. If a Python operator is combined with other operators, Python or Rust, it might not spawn the additional operators (because it waits until the Python operator finishes first). Is that correct? If so, I think we should forbid spawning Python operators together with other operators for now and throw an error in that case.

Otherwise, this PR seems good to go.

Yes, we could engineer a way to make this resilient but they might be some edge cases and deadlock that could happen so let's forbid spawning python operators with other operators.

@haixuanTao haixuanTao merged commit 7d0f8d9 into main Dec 4, 2022
@haixuanTao haixuanTao deleted the python_runtime branch December 4, 2022 17:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Errors when follow README for 0.1.0 release
2 participants