From 558e4fa71a28f55481f6857c30dd836cf6f6fa09 Mon Sep 17 00:00:00 2001
From: "github-actions[bot]" <41898282+github-actions[bot]@users.noreply.github.com>
Date: Fri, 24 May 2024 08:22:20 -0300
Subject: [PATCH] [8.2.x] Add thread safety section to flaky test docs
 (#12362)

Co-authored-by: Nathan Goldbaum
---
 AUTHORS                      |  1 +
 changelog/12356.doc.rst      |  2 ++
 doc/en/explanation/flaky.rst | 22 +++++++++++++++++++---
 3 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 changelog/12356.doc.rst

diff --git a/AUTHORS b/AUTHORS
index cc53ce10d4f..18c60750e30 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -289,6 +289,7 @@ Mike Lundy
 Milan Lesnek
 Miro Hrončok
 mrbean-bremen
+Nathan Goldbaum
 Nathaniel Compton
 Nathaniel Waisbrot
 Ned Batchelder
diff --git a/changelog/12356.doc.rst b/changelog/12356.doc.rst
new file mode 100644
index 00000000000..312c26d3298
--- /dev/null
+++ b/changelog/12356.doc.rst
@@ -0,0 +1,2 @@
+Added a subsection to the documentation for debugging flaky tests to mention
+lack of thread safety in pytest as a possible source of flakiness.
diff --git a/doc/en/explanation/flaky.rst b/doc/en/explanation/flaky.rst
index 41cbe847989..cb6c3983424 100644
--- a/doc/en/explanation/flaky.rst
+++ b/doc/en/explanation/flaky.rst
@@ -18,7 +18,7 @@ System state
 
 Broadly speaking, a flaky test indicates that the test relies on some system state that is not being appropriately controlled - the test environment is not sufficiently isolated. Higher level tests are more likely to be flaky as they rely on more state.
 
-Flaky tests sometimes appear when a test suite is run in parallel (such as use of pytest-xdist). This can indicate a test is reliant on test ordering.
+Flaky tests sometimes appear when a test suite is run in parallel (such as use of `pytest-xdist`_). This can indicate a test is reliant on test ordering.
 
 - Perhaps a different test is failing to clean up after itself and leaving behind data which causes the flaky test to fail.
 - The flaky test is reliant on data from a previous test that doesn't clean up after itself, and in parallel runs that previous test is not always present
@@ -30,9 +30,22 @@ Overly strict assertion
 
 Overly strict assertions can cause problems with floating point comparison as well as timing issues. :func:`pytest.approx` is useful here.
 
+Thread safety
+~~~~~~~~~~~~~
 
-Pytest features
-^^^^^^^^^^^^^^^
+pytest itself is single-threaded: it always runs tests sequentially, in the same thread, and never spawns any threads of its own.
+
+Even plugins which run tests in parallel, for example `pytest-xdist`_, usually work by spawning multiple *processes* and running tests in batches, without using multiple threads.
+
+It is of course possible (and common) for tests and fixtures to spawn threads themselves as part of their testing workflow (for example, a fixture that starts a server thread in the background, or a test which executes production code that spawns threads), but some care must be taken:
+
+* Make sure to eventually wait on any spawned threads -- for example at the end of a test, or during the teardown of a fixture.
+* Avoid using primitives provided by pytest (:func:`pytest.warns`, :func:`pytest.raises`, etc.) from multiple threads, as they are not thread-safe.
+
+If your test suite uses threads and you are seeing flaky test results, do not discount the possibility that the test is implicitly using global state in pytest itself.
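+
+To illustrate the first point above, a fixture that spawns a background thread can make sure the thread is joined during teardown. The ``worker_thread`` fixture below is only a sketch, not something pytest provides:
+
+.. code-block:: python
+
+    import threading
+
+    import pytest
+
+
+    @pytest.fixture
+    def worker_thread():
+        stop = threading.Event()
+
+        def run():
+            # Stand-in for real background work (e.g. a test server loop).
+            while not stop.is_set():
+                stop.wait(0.01)
+
+        thread = threading.Thread(target=run)
+        thread.start()
+        yield thread
+        # Teardown: signal the thread to finish and wait for it, so it never
+        # outlives the test that spawned it.
+        stop.set()
+        thread.join()
+
+
+    def test_worker_is_running(worker_thread):
+        assert worker_thread.is_alive()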
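+
+For the second point, rather than calling :func:`pytest.raises` or :func:`pytest.warns` inside a spawned thread, the thread can hand any exception back so the assertion happens in the main thread. The ``run_in_thread`` helper below is a hypothetical example, not part of pytest:
+
+.. code-block:: python
+
+    import threading
+
+    import pytest
+
+
+    def run_in_thread(func):
+        # Hypothetical helper: run *func* in a thread and return the
+        # exception it raised, if any.
+        errors = []
+
+        def wrapper():
+            try:
+                func()
+            except Exception as exc:
+                errors.append(exc)
+
+        thread = threading.Thread(target=wrapper)
+        thread.start()
+        thread.join()
+        return errors[0] if errors else None
+
+
+    def test_worker_error_is_reported():
+        error = run_in_thread(lambda: 1 / 0)
+        # pytest.raises is used only from the main thread.
+        with pytest.raises(ZeroDivisionError):
+            if error is not None:
+                raise error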
+
+Related features
+^^^^^^^^^^^^^^^^
 
 Xfail strict
 ~~~~~~~~~~~~
@@ -123,3 +136,6 @@ Resources
 
 * `Flaky Tests at Google and How We Mitigate Them `_ by John Micco, 2016
 * `Where do Google's flaky tests come from? `_ by Jeff Listfield, 2017
+
+
+.. _pytest-xdist: https://github.com/pytest-dev/pytest-xdist