Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Design of clear_task_threads #1057

Closed
bkamins opened this issue Dec 4, 2022 · 3 comments · Fixed by JuliaServices/ConcurrentUtilities.jl#10
Closed

Design of clear_task_threads #1057

bkamins opened this issue Dec 4, 2022 · 3 comments · Fixed by JuliaServices/ConcurrentUtilities.jl#10

Comments

@bkamins
Copy link
Member

bkamins commented Dec 4, 2022

In

function clear_thread_states()
we have:

function clear_thread_states()
    # only clear thread states in the workflow where a user is running
    # CSV.File from the REPL at the top-level; if we're already spawned
    # in a task, we don't want to mess up thread task states
    if current_task() == Base.roottask
        Threads.@threads :static for _ in 1:Threads.nthreads()
            Timer(_Returns(nothing), 0; interval = 1)
        end
    end
end

@quinnj I do not understand two things:

  • why this gets executed if we are in root task only? The same problem can be if we are not in root task;
  • I do not understand why we use Timer here? I understand why it is proposed in Thread-local current_task keeps garbage alive JuliaLang/julia#40626 (as it would be a solution in general for Base Julia) but why call this function every second forever from CSV.jl? I would think it is enough to call it once.
@quinnj
Copy link
Member

quinnj commented Dec 7, 2022

Ok, I've dug back into this and here's my understanding:

  • There are 2 ways that references to things are kept around unexpectedly:
    • By being returned from a Threads.@spawn call; even if the object is not referenced in user code, the created Task will keep a reference to it as what was returned from running the task
    • Any inputs that get captured in the closure created to run a Threads.@spawn call; again, the Task keeps the reference to these, even if they aren't further referenced in user code
  • Because of the 2 ways to keep references above, once a Task has finished running, each physical thread keeps a reference to its "current task", which includes its most recently run Task if another has not yet been scheduled; by keeping a reference to a Task, it keeps references to the closure inputs and returned results as mentioned above

One easy mitigation, if possible, is to ensure that Threads.@spawn calls return nothing. That avoids at least one case of a reference to something being kept around unexpectedly.

The closure inputs reference, however, is unavoidable as far as I understand. This leads us to consider alternative solutions, like JuliaLang/julia#47405, which basically tries to treat Tasks that have finished as weak refs, or the clear_thread_states as we've attempted in CSV.jl.

The original definition in CSV.jl was:

function clear_thread_states()
    Threads.@threads :static for _ in 1:Threads.nthreads()
            Timer(_Returns(nothing), 0; interval = 1)
        end
end

But @iamed2 pointed out that this causes problems when making a call like asyncmap(CSV.File, files). The reason is: threads actually rely on being able to reference their "current task" when, i.e. they have multiple Tasks they are running that may call @async themselves and the physical thread is switching between Tasks to run. If CSV.jl, however, "clears the thread state" in the middle of a nested threaded call, it messes up Julia's internal threading state.

The follow up definition tried to avoid this by checking 1st if the current_task() == Base.roottask, which is asking "are we running from a top-level user call" and not from an asynchronous @async or Threads.@spawn call, to try and avoid messing up other asynchronous calls.

Having further reviewed, I agree that having the Timer call is wrong. We're basically spawning a never-ending Timer that wakes up every second for every CSV.File call. We could potentially run this once in the CSV.jl __init__ function, but more appropriate I believe is to just have the spawned task be a simple nothing.

Another alternative that I may try is trying to do the weak referencing of closure inputs ourselves; so we would put all of our closure input arguments into a WeakRefDict manually, and that would be passed to the Threads.@spawn call. This would work in the CSV.jl case I believe because we are always calling @sync around our Threads.@spawn calls, so we could just include a GC.@preserve around the @sync block to ensure our arguments are preserved for the life of the @sync block, but able to be garbage collected afterwards.

@quinnj
Copy link
Member

quinnj commented Dec 7, 2022

Also did a similar writeup + show of a potential new approach here

quinnj added a commit that referenced this issue Dec 7, 2022
Most complete explanation is [here](JuliaLang/julia#40626 (comment)).

Also discussed [here](#1057).

This PR proposes an alternative solution to `clear_thread_states` where that approach
can be problematic (interferring with global thread state, not working as expected in nested spawned
tasks, etc.). The previous definition also started unending `Timer` tasks that could build up over time.

The approach in this PR is to wrap spawned task closure arguments in `WeakRef` to allow them to be
GCed as expected once the tasks are finished.
@quinnj
Copy link
Member

quinnj commented Dec 7, 2022

PR up: #1058

quinnj added a commit to JuliaServices/ConcurrentUtilities.jl that referenced this issue Dec 7, 2022
…e argumetns in WeakRef

Fixes JuliaData/CSV.jl#1057.
Works around current Julia limitation here: JuliaLang/julia#40626.

`@wkspawn` acts just like `Threads.@spawn`, except for mutable, interpolated arguments
in the spawned expression, they will also be transparently wrapped as `WeakRef`s,
then immediately unwrapped within the spawn block. This avoids the `Task` closure
capturing mutable arguments in a more permanent way and preventing their collection
later, even after there are no more program references to the mutable argument.
quinnj added a commit to JuliaServices/ConcurrentUtilities.jl that referenced this issue Dec 7, 2022
#10)

* Add new `@wkspawn` macro for wrapping mutable `Threads.@spawn` closure argumetns in WeakRef

Fixes JuliaData/CSV.jl#1057.
Works around current Julia limitation here: JuliaLang/julia#40626.

`@wkspawn` acts just like `Threads.@spawn`, except for mutable, interpolated arguments
in the spawned expression, they will also be transparently wrapped as `WeakRef`s,
then immediately unwrapped within the spawn block. This avoids the `Task` closure
capturing mutable arguments in a more permanent way and preventing their collection
later, even after there are no more program references to the mutable argument.

* Compat

* fix

* try lots of gc

* Fix

* Run with threads
quinnj added a commit that referenced this issue Dec 8, 2022
…#1058)

* Use WeakRefs for spawned tasks to avoid holding unexpected references

Most complete explanation is [here](JuliaLang/julia#40626 (comment)).

Also discussed [here](#1057).

This PR proposes an alternative solution to `clear_thread_states` where that approach
can be problematic (interferring with global thread state, not working as expected in nested spawned
tasks, etc.). The previous definition also started unending `Timer` tasks that could build up over time.

The approach in this PR is to wrap spawned task closure arguments in `WeakRef` to allow them to be
GCed as expected once the tasks are finished.

* Try putting gc preserve inside Threads.spawn block

* Outside GC preserve manual sync block

* Make Context mutable so it gets preserved properly

* Only wrap in WeakRef if mutable

* oops

* Use `@wkspawn` from WorkerUtilities.jl package

* 1.6 compat
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants