This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

ISSUE-16527: Due to missing Python GIL management, Python clients constructed with a non-default Logger fail to clean up their threads and often segfault when Python garbage collects them #4522

Open
sijie opened this issue Jul 11, 2022 · 0 comments


Original Issue: apache#16527


Describe the bug
If I build a Pulsar client object in Python and supply a logger= value that is not the default result of logging.getLogger(), e.g. logging.getLogger("foobar"), and then interact with that client from one Python thread while another thread is running, two things happen: the client fails to clean up its threads, and Python often segfaults when it garbage collects the client.

Symptoms: if you use Pulsar Client objects in threads, disconnecting and reconnecting leaks threads and other resources, and can segfault Python.

To Reproduce

  1. With an existing topic in TOPIC_NAME, run the below snippet:
import logging
import threading
from pulsar import Client


TOPIC_NAME = "persistent://chariot1/chariot_ns_sre--heartbeat/chariot_topic_heartbeat"


def _do_connect():
    # current_thread() replaces the deprecated currentThread() alias.
    logger = logging.getLogger(str(threading.current_thread().ident))
    logger.setLevel(logging.INFO)
    c = Client(service_url="pulsar://localhost:6650",
               io_threads=4,
               message_listener_threads=4,
               operation_timeout_seconds=1,
               log_conf_file_path=None,
               authentication=None,
               logger=logger,
    )
    c.get_topic_partitions(TOPIC_NAME)
    c.close()


def test_leaks():
    threads = []
    for i in range(10):
        threads.append(threading.Thread(target=_do_connect))
        threads[-1].start()
    print("Joining")
    for thread in threads:
        thread.join()
    print("Final threadcount", threading.active_count())


if __name__ == '__main__':
    test_leaks()
  2. Observe that the "final threadcount" value emitted by that code is >1.
  3. On repeat runs (I ran while python repro.py; do echo iterated; done in a shell), observe that the code eventually segfaults and crashes.
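
The repeat-run step can also be scripted in Python, which makes it easier to tell a segfault apart from an ordinary failure. The following is only a sketch: the helper name run_until_crash and the max_runs limit are hypothetical, it assumes the snippet above is saved as repro.py (as in the shell loop), and it relies on the POSIX behavior of subprocess, where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def run_until_crash(cmd=(sys.executable, "repro.py"), max_runs=100):
    """Run cmd repeatedly (hypothetical helper, not part of the Pulsar client).

    Returns (run_number, signal_name) for the first run that was killed by a
    signal, or None if no run was killed within max_runs attempts.
    """
    for run_number in range(1, max_runs + 1):
        result = subprocess.run(cmd)
        # On POSIX, a return code of -N means the child died from signal N.
        if result.returncode < 0:
            return run_number, signal.Signals(-result.returncode).name
    return None
```

Calling run_until_crash() from the directory that holds repro.py would be expected to return something like (n, "SIGSEGV") once the crash described above occurs. The run count needed is not known in advance.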

Expected behavior

  1. After a thread running the Pulsar client has been joined, none of its side effects should remain. The "final threadcount" value printed above should always be 1.
  2. Segfaults should not occur while using Pulsar clients normally.
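
The first expectation can be checked more precisely than with a bare thread count by listing which threads outlive the joined workers. This is a minimal sketch using only the standard threading module. The helper name threads_leaked_by is hypothetical, and it can only see threads that Python knows about (including the dummy thread objects Python creates for foreign threads that have called into the interpreter), so native threads that never touch Python would not show up.

```python
import threading


def threads_leaked_by(work, worker_count=10):
    """Run work() in worker_count threads and join them (hypothetical helper).

    Returns the threads that are alive after the join but did not exist
    before the workers were started.
    """
    before = set(threading.enumerate())
    workers = [threading.Thread(target=work) for _ in range(worker_count)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    # Anything still registered that was not there before is a leftover.
    return [t for t in threading.enumerate() if t not in before]
```

Passing _do_connect from the snippet above as work, an empty result would correspond to the expected "final threadcount" of 1, and any returned thread objects show what was left behind.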

Desktop (please complete the following information):

  • OS: macOS 12.4 x86 (also observed segfaults on Amazon Linux and on Ubuntu in Docker on macOS).
  • Pulsar client: 2.10.1.
@sijie sijie added the type/bug label Jul 11, 2022
@sijie sijie added the Stale label Aug 11, 2022