Partition user and client tables #285

dgitis · 2023-11-10T22:26:28Z

It would be good to have daily-partitioned versions of the dim_ga4__client_keys, fct_ga4__client_keys, and fct_ga4__user_ids models, similar to what we have with sessions, so that larger sites can disable the non-partitioned models without needing to customize.

The new GA4 user export tables are day partitioned.

I believe this should be tied to #251. There, we add an optional cutoff date for when to start using Google's user export (because, even once the export was enabled, not all of the data arrived immediately). We would then merge the two sources of data in the daily tables and build the non-day-partitioned tables from the merged daily tables.
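A minimal sketch of that flow, written in Python rather than the package's dbt SQL purely to illustrate the logic. The function names, the row shape, and the `day`, `client_key`, and `source` fields are all hypothetical and are not part of the package or the GA4 export schema.

```python
from datetime import date


def merge_daily_rows(package_rows, export_rows, cutoff):
    """Build the merged daily table keyed by (day, client_key).

    Days before the cutoff come from the package-derived rows;
    days on or after the cutoff come from the user export rows.
    """
    merged = {}
    for row in package_rows:
        if row["day"] < cutoff:
            merged[(row["day"], row["client_key"])] = row
    for row in export_rows:
        if row["day"] >= cutoff:
            merged[(row["day"], row["client_key"])] = row
    return merged


def build_non_partitioned(merged):
    """Roll the merged daily rows up to one row per client_key."""
    rollup = {}
    for day, client_key in sorted(merged):  # ascending by day
        entry = rollup.setdefault(
            client_key,
            {"client_key": client_key, "first_seen": day, "last_seen": day},
        )
        entry["last_seen"] = day
    return rollup


# Hypothetical sample data: both sources cover 2023-11-03.
package_rows = [
    {"day": date(2023, 11, 1), "client_key": "a", "source": "package"},
    {"day": date(2023, 11, 3), "client_key": "a", "source": "package"},
]
export_rows = [
    {"day": date(2023, 11, 2), "client_key": "a", "source": "export"},
    {"day": date(2023, 11, 3), "client_key": "a", "source": "export"},
]
cutoff = date(2023, 11, 3)

merged = merge_daily_rows(package_rows, export_rows, cutoff)
users = build_non_partitioned(merged)
```

With this sample data, the 2023-11-03 partition comes from the export and the 2023-11-01 partition comes from the package-derived rows. The export's 2023-11-02 row is ignored because it falls before the cutoff.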

Having compared our client_key fields with the equivalent pseudonymous_users table in the new export, I think it is best that we set up our daily tables to contain basically the same data as the new export, renamed and unnested to our usual standard. We would then try to build as much as possible from before the cutoff into that table.

For the non-partitioned tables, do we try to maintain compatibility with our existing fields? For example, the first_device_* and first_geo_* fields don't have equivalents in the GA4 export.

While it would be nice to maintain compatibility, I personally don't use most of those fields.

If others use them, then I'm happy to rebuild that downstream of the daily models.

I am resistant to rebuilding that data in the daily models. If you're trying to reduce costs by using just the daily models despite the less accurate data, then you probably won't want to do the look-ups required to enhance the daily models either, particularly if you don't use those fields all that often.

Thoughts @adamribaudo @willbryant ?


adamribaudo-velir · 2023-11-12T17:28:39Z

Waiting for access to a dataset that actually holds this data before weighing in. Should be soon.

dgitis mentioned this issue Apr 13, 2024

User Export #317

