[2.0] Should Lock default to using contextvars? #1040
Comments
This does not appear to be straightforward. The Python docs claim ContextVars should only be created at the top level of a module, that is, outside of a class. Multiple Tasks will see different values for the ContextVar, but multiple Lock instances within the same Task, or multiple Lock instances outside of a Task context, would share the same ContextVar. That doesn't work with the current design of Lock. Putting a ContextVar instance inside of each Lock instance the same way that redis-py uses thread-local storage works, but the docs claim these variables won't get garbage-collected. 🤔 |
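To illustrate the sharing problem described above, here is a minimal sketch. `SharedVarLock` is a hypothetical stand-in that only tracks a token; it is not the aioredis `Lock` class and makes no Redis calls. It assumes a single module-level `ContextVar`, as the Python docs recommend.

```python
import asyncio
import contextvars

# One module-level ContextVar, as the Python docs recommend.
_token = contextvars.ContextVar("lock_token", default=None)


class SharedVarLock:
    """Hypothetical stand-in: token bookkeeping only, no Redis."""

    def __init__(self, name):
        self.name = name

    def set_token(self, value):
        _token.set(value)

    def get_token(self):
        return _token.get()


async def demo_shared_var():
    a = SharedVarLock("a")
    b = SharedVarLock("b")
    a.set_token("token-for-a")
    # b never set a token, yet it reads a's value: both instances
    # look at the same ContextVar within the same Task.
    return b.get_token()


result_shared = asyncio.run(demo_shared_var())
```

Within one Task, the second instance sees the first instance's token, which is why a single shared variable does not fit a design where each Lock instance owns its own token.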
@Andrew-Chen-Wang Have you looked into this much? I'm not sure how important this garbage collection reference in the docs is, or how different this is than the situation we already had with thread-local storage. |
@abrookins take a look at NoneGG/aredis#120 |
I wonder why we even need a ContextVar at all. 🤔 redis-py was trying to solve the problem that multiple threads could try to change the same Lock instance. We want to solve the problem that multiple Tasks might try to change the same Lock instance. Right? Wouldn't using an asyncio.Lock work here, so that only one Task could change the state of the Lock instance at a given time? ... Without any memory leaks. |
I was about to say use asyncio.Lock, but without looking at the code at all, I assumed this was a Redis lock itself like https://github.com/joanvila/aioredlock Edit: currently looking at an async database implementation here encode/databases#230 |
Right, it's a lock backed by a Redis key. I see in that thread that they did not actually resolve the question of whether the garbage collection problem mattered. It seems like using asyncio.Lock to lock the internal state of the (heh, Redis-backed) Lock instance would achieve the goal without leaking memory with ContextVar instances inside the (Redis-backed) Lock instance. I've been working on a PR that uses ContextVar, but let me refashion it to use asyncio.Lock to see if that makes more sense. |
Oh, right, locks don't do the same thing. With thread-local storage, if two threads were using the same (Redis) Lock instance, one thread acquiring the Redis Lock would block the other thread. That's because each thread would use a different token value, which is what Lock uses to distinguish if the current instance "owns" the protected resource currently. I.e. current Redis value == Lock token. If we did use asyncio.Locks inside the Redis Lock, two Tasks sharing the same Lock wouldn't block each other, they would see the same token value. So we either do need to use ContextVar in the form that the aredis folks did, which is to say against the advice of the Python docs, or else...? I'm not sure what the alternative is. |
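A small sketch of the thread-local behavior described above, assuming a hypothetical `ThreadLocalTokenHolder` class (not redis-py's actual `Lock`) that keeps its token in `threading.local()`:

```python
import threading
import uuid


class ThreadLocalTokenHolder:
    """Hypothetical: mimics only the token storage, not the Redis calls."""

    def __init__(self):
        self.local = threading.local()

    def make_token(self):
        self.local.token = uuid.uuid4().hex
        return self.local.token


holder = ThreadLocalTokenHolder()
seen = {}


def worker(name):
    holder.make_token()
    # Each thread reads back only the token it stored itself.
    seen[name] = holder.local.token


threads = [threading.Thread(target=worker, args=(n,)) for n in ("t1", "t2")]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never stored a token, so it sees none.
main_has_token = hasattr(holder.local, "token")
```

Both threads share one `holder`, but each ends up with its own token, so a comparison like "current Redis value == my token" can only succeed for the thread that actually acquired the lock.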
The Lock implementation in redis-py uses thread-local storage so that multiple threads using the same Lock instance can acquire the Lock from each other. Thread-local storage ensures that each thread sees a different token value. Thread-local storage does not apply in the Task-based concurrency that asyncio programs use. To achieve a similar effect, we need to embed a ContextVar instance within each Lock and store the Lock instance's token within the ContextVar instance. This allows every Task that uses the same Lock instance to see a different token. Thus, if both Task A and Task B refer to Lock 1, Task A can "acquire" Lock 1 and block Task B from acquiring the same Lock until Task A "releases" the Lock. NOTE: The Python documentation suggests only storing ContextVar instances in the top-level module scope due to issues around garbage collection. That won't work in the current design of Lock. For lack of a better alternative, and to preserve the original design of Lock taken from redis-py, we have created instances of ContextVar within instances of Lock. Fixes #1040.
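A rough sketch of the per-instance approach the description above outlines. `PerInstanceVarLock` and its method names are hypothetical and cover only the token bookkeeping; this is not the code from the PR.

```python
import asyncio
import contextvars
import uuid


class PerInstanceVarLock:
    """Hypothetical sketch: one ContextVar per lock instance, no Redis."""

    def __init__(self, name):
        self._token = contextvars.ContextVar(f"token:{name}", default=None)

    def new_token(self):
        token = uuid.uuid4().hex
        self._token.set(token)
        return token

    def current_token(self):
        return self._token.get()


async def use_lock(lock):
    lock.new_token()
    await asyncio.sleep(0)  # yield so the other Task runs in between
    return lock.current_token()


async def demo_per_instance():
    lock = PerInstanceVarLock("lock-1")
    # gather() wraps each coroutine in its own Task, and each Task
    # runs in its own copy of the context.
    return await asyncio.gather(use_lock(lock), use_lock(lock))


token_a, token_b = asyncio.run(demo_per_instance())
```

Two sibling Tasks sharing one lock instance each read back their own token here. This sketch does not address the garbage-collection caveat mentioned in the NOTE.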
@Andrew-Chen-Wang I assume you already get pinged out the wazoo, but: I opened a PR with my changes to use ContextVar. See linked. |
Thanks for opening the PR and sorry for the lack of responses. Just a quick thought: Is asyncio.Queue feasible to store the aforementioned Lock tokens and continuously put and pop? asyncio.Queue is not thread-safe, but janus from aio-libs is a thread-safe asyncio Queue. Also after looking around a bunch of async repos like FastAPI (comment, though I think that's actually a correct use case) and db wrappers like encode/databases, I keep seeing everyone recommend using contextvars. In the latter repo, they saw a problem with contextvars but haven't created a PR to resolve it yet (comment). |
The FastAPI comment sounds like the advised way to use ContextVars, based on the docs. I.e., a single top-level module instance. As for the second link, that appears to solidly reject my PR and the use of ContextVars in general for this problem, because of the text quoted:
encode/databases did exactly what I'm proposing to do, but it appears not to work correctly because Tasks copy the current context -- thus a new Task might see the same token as another Task, and the Lock would not work as intended. Update: Jeeze, I'm still wondering if there's a way to use ContextVar correctly here. I'll have to look at this again after I take a break, maybe draw some flow charts. 😂 |
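The context-copying behavior mentioned above can be seen in a few lines. The variable name `token_var` and the token string are made up for illustration.

```python
import asyncio
import contextvars

token_var = contextvars.ContextVar("token", default=None)


async def child():
    # The child never sets a token; it reads whatever was in the
    # parent's context at the moment the Task was created.
    return token_var.get()


async def parent():
    token_var.set("parent-token")
    # create_task() copies the current context into the new Task.
    return await asyncio.create_task(child())


inherited_token = asyncio.run(parent())
```

A Task spawned after the parent has stored a token starts out seeing that same token, so a "do I own the lock?" check based on the token would pass in the child as well.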
So with a Queue, what do you imagine? Something like this?
That might work. If the queue's max size is 1, it can only ever hold one Lock token. So whichever blocked Task (waiting on the release of the Lock) pulls out a token first "wins." |
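The snippet originally posted with the question above is not preserved here. The following is a separate, in-process sketch of the idea as described: a queue of max size 1 that holds a single token, which a Task must take before proceeding. All names are hypothetical, and no Redis key is involved.

```python
import asyncio


async def worker(name, slot, events):
    token = await slot.get()  # waits until the single token is available
    events.append((name, "acquired"))
    await asyncio.sleep(0)  # give the other Task a chance to run
    events.append((name, "released"))
    slot.put_nowait(token)  # hand the token back


async def demo_queue():
    slot = asyncio.Queue(maxsize=1)
    slot.put_nowait("the-only-token")
    events = []
    await asyncio.gather(
        worker("A", slot, events),
        worker("B", slot, events),
    )
    return events


events = asyncio.run(demo_queue())
```

Only one Task holds the token at a time, so each "acquired" entry is followed by the same Task's "released" entry. This coordinates Tasks within a single event loop only; whether it can be combined with the Redis-backed token check is the open question in this thread.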
Basically, yes, we hope? I guess the problem would be the order; there's a probability that a token can never be put into the queue if we use max size 1 unless I'm not understanding the put correctly + locking mechanism. I guess it'd be worth a try? Btw it was just a guess; no idea if it'd work and did not think it thoroughly through. |
Don't mean to beat a dead horse but I do think you can use contextvars (top-level module) without problems when starting nested tasks. Maybe I misunderstood the problem or the order of concurrent calls |
Initially the module-level contextvar-based token must be not-set, and when entering and exiting async contexts ( |
Tested with my own fork of the lock module, but no lock acquirers can actually acquire the lock, just like before patching with contextvars. 🤔 |
Has anyone experimented with just creating a contextvar with a value as a dictionary, then for each lock, we assign it a unique id such as |
threading.local or contextvars.ContextVar("lock") should be global; then it works. Are you still solving this problem? |
I've never used the Lock class in redis-py. But when glancing at the code, I noticed that the default is to store the lock state in thread-local storage (so that two threads can each independently acquire the lock without interfering with each other).
For aioredis, the standard concurrency mechanism is tasks, not threads, so it probably makes sense to default to using contextvars (so that two tasks can independently acquire the lock).