This repository has been archived by the owner on Apr 26, 2024. It is now read-only.
Add a new version of the R30 phone-home metric, which removes a false impression of retention given by the old R30 metric #10332
Merged
Changes from 13 commits
24 commits
899e7c4 Add initial R30v2 tests (reivilibre)
7a87cbe Allow custom headers (e.g. User-Agent) in some test helpers (reivilibre)
e106b94 Add code to calculate R30v2 (reivilibre)
98dcf2b Tweak and add R30v2 tests (reivilibre)
8511e1a Add R30v2 stats to the phone home data (reivilibre)
248966a Newsfile (reivilibre)
39a3b8f Explicitly report 0 when no clients of that type are present (reivilibre)
2eda2a7 Remove review question (reivilibre)
261b6c4 Remove some unneeded advances (reivilibre)
6b6b6f2 Apply rename (reivilibre)
fd4f493 Rewrite R30v2 query to not use window function (reivilibre)
92d215c Cast things to BIGINT because poor Postgres cries (reivilibre)
9353939 Add alias to make Postgres happy (reivilibre)
6762bcf Move multiplies to python from SQL (reivilibre)
2fca561 Don't bother ordering (reivilibre)
80c9187 Simplify WHEN chain (reivilibre)
79db58b Remove review comment (reivilibre)
982b2d1 Clarify desired_time is secs (reivilibre)
7860bc9 Factorise out store (reivilibre)
a865726 Clean up and standardise the advance times (reivilibre)
44a0f91 Describe the structure of the dict (reivilibre)
46827cf Clean up casts and move multiplications into Python (reivilibre)
899251c Clarify comment about five minutes (reivilibre)
8a4f589 Factorise store again (oops) (reivilibre)
@@ -0,0 +1 @@
Add a new version of the R30 phone-home metric, which removes a false impression of retention given by the old R30 metric.
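To make the new metric concrete, here is a minimal sketch of the 'all users' R30v2 criterion from this PR (more than 30 days between a user's earliest and latest visit inside the 60-day window), run against an in-memory SQLite database. The two-column table schema, the `count_r30v2_all` and `demo` helpers, the fixed `NOW_SECS` clock and the example user IDs are assumptions made for illustration and are not part of Synapse. Unlike the query in this PR, the sketch converts seconds to milliseconds in Python rather than in SQL.

```python
import sqlite3

DAY_SECS = 86400
NOW_SECS = 100 * DAY_SECS  # arbitrary fixed "now" so the example is deterministic

# Same shape as the 'all users' query in this PR, with millisecond parameters.
R30V2_ALL_SQL = """
    SELECT COUNT(*) FROM (
        SELECT 1
        FROM user_daily_visits
        WHERE timestamp > ? AND timestamp < ?
        GROUP BY user_id
        HAVING max(timestamp) - min(timestamp) > ?
    ) AS r30_users
"""


def count_r30v2_all(conn, now_secs):
    """Count users whose visits in the window span more than 30 days."""
    thirty_days_ms = 30 * DAY_SECS * 1000
    sixty_days_ago_ms = now_secs * 1000 - 2 * thirty_days_ms
    one_day_from_now_ms = (now_secs + DAY_SECS) * 1000
    row = conn.execute(
        R30V2_ALL_SQL, (sixty_days_ago_ms, one_day_from_now_ms, thirty_days_ms)
    ).fetchone()
    return row[0]


def demo():
    conn = sqlite3.connect(":memory:")
    # Simplified, assumed schema: only the columns this query needs.
    conn.execute("CREATE TABLE user_daily_visits (user_id TEXT, timestamp BIGINT)")
    visits_days_ago = [
        ("@retained:example.org", 40),  # visits 35 days apart: counted
        ("@retained:example.org", 5),
        ("@brief:example.org", 10),  # visits 7 days apart: not counted
        ("@brief:example.org", 3),
        ("@once:example.org", 1),  # a single visit: not counted
        ("@stale:example.org", 90),  # outside the 60-day window
        ("@stale:example.org", 2),  # so only one visit remains: not counted
    ]
    conn.executemany(
        "INSERT INTO user_daily_visits VALUES (?, ?)",
        [(user, (NOW_SECS - days * DAY_SECS) * 1000) for user, days in visits_days_ago],
    )
    return count_r30v2_all(conn, NOW_SECS)


if __name__ == "__main__":
    print(demo())
```

In this data set only `@retained:example.org` satisfies the criterion. The `@stale:example.org` rows show the effect of the 60-day window: a very old visit does not count towards the 30-day spread.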
@@ -316,6 +316,126 @@ def _count_r30_users(txn):

        return await self.db_pool.runInteraction("count_r30_users", _count_r30_users)

    async def count_r30v2_users(self) -> Dict[str, int]:
        """
        Counts the number of 30 day retained users, defined as users that:
         - Appear more than once in the past 60 days
         - Have more than 30 days between the most and least recent appearances that
           occurred in the past 60 days.

        (This is the second version of this metric, hence R30'v2')

        Returns:
             A mapping of counts globally as well as broken out by platform.
        """

        def _count_r30v2_users(txn):
            thirty_days_in_secs = 86400 * 30
            now = int(self._clock.time())
            sixty_days_ago_in_secs = now - 2 * thirty_days_in_secs
            one_day_from_now_in_secs = now + 86400

            # This is the 'per-platform' count.
            sql = """
                SELECT
                    client_type,
                    count(client_type)
                FROM
                    (
                        SELECT
                            user_id,
                            CASE
                                WHEN
                                    user_agent IS NULL OR
                                    user_agent = ''
                                THEN 'unknown'
                                WHEN
                                    LOWER(user_agent) LIKE '%%riot%%' OR
                                    LOWER(user_agent) LIKE '%%element%%'
                                THEN CASE
                                    WHEN
                                        LOWER(user_agent) LIKE '%%electron%%'
                                    THEN 'electron'
                                    WHEN
                                        LOWER(user_agent) LIKE '%%android%%'
                                    THEN 'android'
                                    WHEN
                                        LOWER(user_agent) LIKE '%%ios%%'
                                    THEN 'ios'
                                    ELSE 'unknown'
                                END
                                WHEN
                                    LOWER(user_agent) LIKE '%%mozilla%%' OR
                                    LOWER(user_agent) LIKE '%%gecko%%'
                                THEN 'web'
                                ELSE 'unknown'
Review comment: Please could we avoid having two slightly different copies of the User-Agent to platform mapping? Factor it out to a constant, maybe?
Reply: I'm hesitant to change the pre-existing R30 mapping, but as far as I know, we only want to count Element clients for R30v2.
                            END as client_type
                        FROM
                            user_daily_visits
                        WHERE
                            timestamp > (CAST(? AS BIGINT) * 1000)
                        AND
                            timestamp < (CAST(? AS BIGINT) * 1000)
                        GROUP BY
                            user_id,
                            client_type
                        HAVING
                            max(timestamp) - min(timestamp) > (CAST(? AS BIGINT) * 1000)
                    ) AS temp
                GROUP BY
                    client_type
                ORDER BY
                    client_type
                ;
            """

            # We initialise all the client types to zero, so we get an explicit
            # zero if they don't appear in the query results
            results = {"ios": 0, "android": 0, "web": 0, "electron": 0}
            txn.execute(
                sql,
                (sixty_days_ago_in_secs, one_day_from_now_in_secs, thirty_days_in_secs),
            )

            for row in txn:
                if row[0] == "unknown":
                    continue
                results[row[0]] = row[1]

            # This is the 'all users' count.
            sql = """
                SELECT COUNT(*) FROM (
                    SELECT
                        1
                    FROM
                        user_daily_visits
                    WHERE
                        timestamp > (CAST(? AS BIGINT) * 1000)
                    AND
                        timestamp < (CAST(? AS BIGINT) * 1000)
                    GROUP BY
                        user_id
                    HAVING
                        max(timestamp) - min(timestamp) > (CAST(? AS BIGINT) * 1000)
                ) AS r30_users
            """

            txn.execute(
                sql,
                (sixty_days_ago_in_secs, one_day_from_now_in_secs, thirty_days_in_secs),
            )
            row = txn.fetchone()
            if row is None:
                results["all"] = 0
            else:
                results["all"] = row[0]

            return results

        return await self.db_pool.runInteraction(
            "count_r30v2_users", _count_r30v2_users
        )

    def _get_start_of_day(self):
        """
        Returns millisecond unixtime for start of UTC day.
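The review thread in this diff asks whether the User-Agent to platform mapping could be factored out rather than duplicated. As one way to read the nested CASE expression in the per-platform query, here is a minimal Python sketch of the same decision order. The function name `user_agent_to_client_type` and the sample User-Agent strings are hypothetical and chosen only for illustration; the PR itself keeps the mapping in SQL.

```python
from typing import Optional


def user_agent_to_client_type(user_agent: Optional[str]) -> str:
    """Mirror the CASE expression of the per-platform R30v2 query (illustrative only)."""
    # NULL or empty User-Agent maps to 'unknown'.
    if not user_agent:
        return "unknown"

    ua = user_agent.lower()

    # Only Riot/Element clients are split into electron/android/ios.
    if "riot" in ua or "element" in ua:
        if "electron" in ua:
            return "electron"
        if "android" in ua:
            return "android"
        if "ios" in ua:
            return "ios"
        return "unknown"

    # Anything else that looks like a browser is counted as 'web'.
    if "mozilla" in ua or "gecko" in ua:
        return "web"

    return "unknown"


if __name__ == "__main__":
    # Hypothetical User-Agent strings, not taken from real clients.
    for sample in (None, "Element/1.0 (Android 11)", "Mozilla/5.0 Gecko/20100101", "curl/7.0"):
        print(repr(sample), "->", user_agent_to_client_type(sample))
```

The order of the checks matters: a desktop client whose User-Agent mentions both Mozilla and Element is classified by the Element branch first, so it is reported as `electron` rather than `web`.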
These repeated calls to hs.get_datastore() are fugly :/. It's fine for now, for consistency with the rest of the code, but at some point it would be nice to factor this out.
I have a follow-up PR ready to go after this one.
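The refactoring discussed in this thread (looking up the datastore once instead of calling `hs.get_datastore()` repeatedly) can be sketched as follows. This is only an illustration under stated assumptions: `FakeStore`, `FakeHomeServer`, `collect_stats`, `count_daily_users`, the returned numbers and the stats key names are hypothetical stand-ins, and only `hs.get_datastore()` and `count_r30v2_users()` are names that appear in this PR.

```python
import asyncio
from typing import Dict


class FakeStore:
    """Hypothetical stand-in for the datastore, returning canned numbers."""

    async def count_r30v2_users(self) -> Dict[str, int]:
        return {"ios": 0, "android": 2, "web": 1, "electron": 0, "all": 3}

    async def count_daily_users(self) -> int:
        return 5


class FakeHomeServer:
    """Hypothetical stand-in for the homeserver object; counts datastore lookups."""

    def __init__(self) -> None:
        self._store = FakeStore()
        self.datastore_lookups = 0

    def get_datastore(self) -> FakeStore:
        self.datastore_lookups += 1
        return self._store


async def collect_stats(hs: FakeHomeServer) -> Dict[str, int]:
    # Look the store up once and reuse the local variable afterwards.
    store = hs.get_datastore()

    stats: Dict[str, int] = {}
    stats["daily_active_users"] = await store.count_daily_users()

    # Flatten the per-platform R30v2 mapping into prefixed keys (assumed naming).
    for name, count in (await store.count_r30v2_users()).items():
        stats["r30v2_users_" + name] = count
    return stats


if __name__ == "__main__":
    hs = FakeHomeServer()
    print(asyncio.run(collect_stats(hs)))
    print("datastore lookups:", hs.datastore_lookups)
```

Holding the store in a local variable keeps each stats line short and makes it obvious that every count comes from the same datastore.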