This repository has been archived by the owner on May 11, 2021. It is now read-only.
If the local `ModelDict` cache is populated (either from the database [Django] or from the remote cache; the source does not affect this bug) and the remote cache is subsequently cleared, expires, or somehow crashes, such that the values at `self.remote_cache_last_updated_key` and `self.remote_cache_key` do not exist, the local cache will never get reloaded unless somebody changes some value that causes the remote cache to get repopulated.
Compare the original code from right after forking from Disqus (django-modeldict/modeldict/base.py, lines 168 to 189 in 00350ad) to the current code from this project (django-modeldict/modeldict/base.py, lines 160 to 185 in 3046ff9). The relevant tail of the current version:

```python
            # such so that we won't expire until the next timeout
            self._local_last_updated = now

        # We last checked for remote changes just now
        self._last_checked_for_remote_changes = now

        # Update from cache if local_cache is still empty
        if self._local_last_updated is None:
            self._update_cache_data()
```
Now follow along with what happens when the remote cache has been cleared after the local cache has been populated. First, `local_cache_is_invalid()` returns `None`. This causes `self.remote_cache_last_updated_key` to be set in the cache. Then we obtain `remote_value` from the remote cache using `self.remote_cache_key`. But this will still be `None` because the remote cache has been cleared. As such, the `if remote_value is not None` block is not executed. `self._local_cache` and `self._local_last_updated` remain unchanged, and since the local cache had already been populated (`self._local_last_updated` is no longer `None`), `self._update_cache_data()` also does not get called. The local cache is never updated.

In the original code, `self._local_cache` would have been set to `None`, which would have caused `self._update_cache_data()` to be called.
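The stale-cache path can be reproduced with a toy model. To be clear, this is hypothetical scaffolding, not django-modeldict's actual code: a plain dict stands in for Memcached, and the database reload is stubbed out. Only the control flow mirrors the snippets quoted above.

```python
# Toy reproduction of the stale-cache path described above.  NOT the
# library's real code: remote_cache is a dict standing in for Memcached,
# and _update_cache_data() is a stub for rebuilding from the database.

class ToyCachedDict:
    def __init__(self, remote_cache):
        self.remote_cache = remote_cache
        self.remote_cache_key = "gargoyle"
        self._local_cache = None
        self._local_last_updated = None

    def _update_cache_data(self):
        # Stand-in for rebuilding the local cache from the database.
        self._local_cache = {"flag": "fresh-from-db"}
        self._local_last_updated = 1.0

    def _populate(self):
        remote_value = self.remote_cache.get(self.remote_cache_key)
        if remote_value is not None:
            # Only runs while the remote cache still holds a value.
            self._local_cache = remote_value
            self._local_last_updated = 1.0
        # Once the local cache has been populated, _local_last_updated is
        # no longer None, so a cleared remote cache never triggers a reload.
        if self._local_last_updated is None:
            self._update_cache_data()
        return self._local_cache


remote = {"gargoyle": {"flag": "old-value"}}
d = ToyCachedDict(remote)
d._populate()         # local cache populated from the remote cache
remote.clear()        # simulate a Memcached flush/expiry/crash
print(d._populate())  # {'flag': 'old-value'}: still the stale data
```

The second `_populate()` call returns the stale value forever, exactly the symptom described above.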
At first glance, it seems like the easiest fix is simply to revert the changes to this method, but I'm sure there were good reasons for this change. So, instead, I propose an alternate fix:
```python
        if remote_value is not None:
            self._local_cache = remote_value
            # We've updated from remote, so mark ourselves as
            # such so that we won't expire until the next timeout
            self._local_last_updated = now
        else:
            self._local_cache = None
            self._local_last_updated = None
```
This just adds three new lines to clear `self._local_cache` and `self._local_last_updated`, which will cause `self._update_cache_data()` to be called.
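As a sanity check, here is a toy model of the patched flow (again, hypothetical scaffolding: a dict stands in for Memcached and the database reload is stubbed out, so this is not the library's real code). With the `else` branch in place, a cleared remote cache now forces the reload:

```python
# Toy model of _populate() WITH the proposed else branch applied.
# Not the library's real code: remote_cache is a dict standing in for
# Memcached, and the database reload is stubbed out.

class PatchedToyCachedDict:
    def __init__(self, remote_cache):
        self.remote_cache = remote_cache
        self.remote_cache_key = "gargoyle"
        self._local_cache = None
        self._local_last_updated = None

    def _update_cache_data(self):
        # Stand-in for rebuilding the local cache from the database.
        self._local_cache = {"flag": "fresh-from-db"}
        self._local_last_updated = 1.0

    def _populate(self):
        remote_value = self.remote_cache.get(self.remote_cache_key)
        if remote_value is not None:
            self._local_cache = remote_value
            self._local_last_updated = 1.0
        else:
            # The proposed fix: forget the local state so the
            # reload below actually fires.
            self._local_cache = None
            self._local_last_updated = None
        if self._local_last_updated is None:
            self._update_cache_data()
        return self._local_cache


remote = {"gargoyle": {"flag": "old-value"}}
d = PatchedToyCachedDict(remote)
d._populate()         # local cache populated from the remote cache
remote.clear()        # simulate a Memcached flush
print(d._populate())  # {'flag': 'fresh-from-db'}: reloaded from the DB
```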
To understand the source/impact of this bug, consider our specific environment. We have hundreds of servers spread across multiple AWS availability zones. Each server hosts dozens of Django workers, all using Gargoyle feature flags. We don't use a single, central Memcached server to back Gargoyle feature flags, because that would be a SPOF. Instead, each server has its own, local Memcached that serves as the Gargoyle cache for all of the Django workers on that server. As such, when we update a feature flag, it does not update the Memcached that serves all workers with the updated Gargoyle `ModelDict`, just the one that serves the workers on the same machine that accepted the request to update the feature flag. To get around this, we have the Gargoyle cache expire every five minutes. This worked well when we were using Disqus, but when we switched to YPlan in order to upgrade to Python 3, it broke due to this bug. The workers only get an updated Gargoyle cache if we restart them.
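For reference, the per-server layout described above corresponds to Django cache settings along these lines. This is illustrative only; the backend class and address are assumptions, not our actual production config:

```python
# Illustrative Django settings for the per-server layout described above.
# The backend class and address are assumptions, not our real values.
CACHES = {
    "default": {
        # Each host talks only to the Memcached instance running on that
        # same host, so there is no single point of failure -- but also
        # no cross-host invalidation, hence the five-minute expiry.
        "BACKEND": "django.core.cache.backends.memcached.MemcachedCache",
        "LOCATION": "127.0.0.1:11211",
    },
}
```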
I'll post a pull request after testing it thoroughly.
Linked commit "…e cache is cleared": Per adamchainz#176, we need to make sure that the local cache gets reset if the remote cache is cleared. This commit accomplishes that.