
Validation procedure of Robyn #772

Closed
ghltk opened this issue Jul 6, 2023 · 10 comments

ghltk commented Jul 6, 2023

Hi team,
We have finished building our Robyn model and want to apply the Budget Allocation results to next month's budget plan. Before we apply the model, we need a way to measure its effectiveness. However, the Robyn guide doesn't cover how to do incremental verification of media channels, so I'm asking here.

In the Budget Allocation one-pager, we can see the Total Response of each media channel. At first, we simply set Total Response as a target KPI and tried to see how close we came to that number after a simulation period (e.g. 4 weeks). However, we found that the total response obtained from the refresh model was not comparable with the total response obtained from the initial model. Since the two models use different data, they attribute the media and non-media contributions differently, so Total Response cannot be used as a validation value.

[Questions]

  • Can I set Total Response as the target KPI for the media channels in the Budget Allocation result of the initial model?
  • If so, how can I get the numbers to compare for verification after the simulation period (4 weeks)?
  • Is there a way to check whether the expected response has been reached for each channel in the same way?

I'd appreciate it if you could explain the validation procedure for Robyn.
Even if it is not a 100% scientific verification method, it would be helpful if you could explain the currently available validation methods.
(We are currently not in a position to run Calibration or Geo Lift tests, so please suggest methods other than those.)

Thank you!😊

gufengzhou commented Jul 24, 2023

Hi, sorry for the late reply. Regarding your question about the budget allocator and the initial model not matching, please check my answer on another similar issue.

Regarding validation: as explained there, you'll get the same spend share as the initial model by doing the following. I just picked a random model built on the simulated data:

library(dplyr)

AllocatorCollect1 <- robyn_allocator(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  date_range = "all", # use the full modelling window (default is last month)
  channel_constr_low = 0.7,
  channel_constr_up = c(1.2, 1.5, 1.5, 1.5, 1.5),
  scenario = "max_response",
  export = create_files
)

# spend & effect share from the initial model decomposition
OutputCollect$xDecompAgg %>%
  filter(solID == select_model & !is.na(spend_share)) %>%
  select(rn, spend_share, effect_share) %>%
  arrange(rn)

# spend & response share from the allocator's initial scenario
AllocatorCollect1$dt_optimOut %>%
  select(channels, initSpendShare, initResponseUnitShare)

[screenshot: spend_share from the model matches initSpendShare from the allocator]
As you can see in the result, the spend shares are identical after setting date_range to "all", because the initial model one-pager considers all dates, while the allocator defaults to the last 4 weeks.

And yes, the effect share is different, which is also explained in the linked comment above. For the initial model, the effect share is just each channel's % of the total weekly avg. effect, i.e. simply the historical share. For the allocator, I use the weekly avg. spend to simulate the weekly avg. carryover and then simulate the weekly avg. response. It's a simulation process, NOT the historical share anymore.
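
To make the contrast concrete, here is a minimal sketch that puts the two definitions side by side. It reuses the InputCollect/OutputCollect/AllocatorCollect1 objects and the column names printed in the output above; it is a comparison aid, not part of the Robyn API:

library(dplyr)

# Historical effect share straight from the model decomposition vs. the
# allocator's simulated response share, joined by channel name.
left_join(
  OutputCollect$xDecompAgg %>%
    filter(solID == select_model & !is.na(spend_share)) %>%
    select(rn, effect_share),
  AllocatorCollect1$dt_optimOut %>%
    select(rn = channels, initResponseUnitShare),
  by = "rn"
)
# effect_share: historical % of the total decomposed media effect
# initResponseUnitShare: share of the simulated weekly avg. response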

@gufengzhou gufengzhou self-assigned this Jul 24, 2023
facebook-github-bot pushed a commit that referenced this issue Jul 24, 2023
- date_range in robyn_response should only refer to modelling window, not the total dataset.
- bump up dev version

gufengzhou commented Jul 24, 2023

Just found a bug and pushed a fix to the response function. Now you can validate between the initial model, robyn_response & robyn_allocator as follows:

## comparing responses
library(dplyr)
last_period <- 1
media_sorted <- sort(InputCollect$paid_media_spends)

## get last-period response from the initial model decomposition
val_response_a <- OutputCollect$xDecompVecCollect %>%
  filter(solID == select_model) %>%
  select(ds, all_of(media_sorted)) %>%
  tail(last_period)

## get last-period response from robyn_response
val_response_b <- list()
for (i in seq_along(media_sorted)) {
  Response <- robyn_response(
    InputCollect = InputCollect,
    OutputCollect = OutputCollect,
    select_model = select_model,
    metric_name = media_sorted[i],
    date_range = paste0("last_", last_period)
  )
  val_response_b[["ds"]] <- Response$date
  val_response_b[[media_sorted[i]]] <- Response$response_total
}
val_response_b <- bind_cols(val_response_b)

## get last-period response from robyn_allocator
AllocatorCollect1 <- robyn_allocator(
  InputCollect = InputCollect,
  OutputCollect = OutputCollect,
  select_model = select_model,
  date_range = paste0("last_", last_period), # same window as above
  # total_budget = NULL, # when NULL, defaults to total spend in date_range
  channel_constr_low = 0.7,
  channel_constr_up = c(1.2, 1.5, 1.5, 1.5, 1.5),
  # channel_constr_multiplier = 3,
  scenario = "max_response",
  export = create_files
)
val_response_c <- AllocatorCollect1$dt_optimOut %>%
  select(date_min, date_max, initResponseUnit)

val_response_a
val_response_b
val_response_c

When doing last_period <- 1, you can see they all align:
[screenshot: val_response_a, val_response_b and val_response_c all match for last_period <- 1]
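
For a quick numeric check instead of eyeballing the printouts, here is a small assertion; a sketch reusing the val_response_a/val_response_b objects from the code above:

# The decomposition and robyn_response should agree to floating point;
# for last_period <- 1 the allocator's initResponseUnit matches as well.
stopifnot(isTRUE(all.equal(
  unlist(val_response_a[media_sorted]),
  unlist(val_response_b[media_sorted]),
  check.attributes = FALSE
)))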

When doing last_period <- 3, the initial model & response function align and output the historical response for every period, but the allocator runs a simulation behind the scenes and thus uses the avg. carryover of the last 3 periods to determine the result.
[screenshot: results for last_period <- 3, where the allocator differs per period]

@tgtod002

@gufengzhou, so given the above example, looking at the results for facebook_s, the expected total response for the allotted period would be:

                      initResponseUnit  periods  total
when last_period = 1  106625.5          1        106625.5
when last_period = 3  99103.13          3        297309.4
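
A minimal sketch of that arithmetic, assuming the AllocatorCollect1 from the last_period <- 3 run; the periods value is hardcoded here purely for illustration:

library(dplyr)

# Total expected response over the window = per-period unit response
# multiplied by the number of periods covered by date_range (here 3).
AllocatorCollect1$dt_optimOut %>%
  mutate(
    periods = 3, # assumed to match the "last_3" date_range used above
    initResponseTotal = initResponseUnit * periods
  ) %>%
  select(channels, initResponseUnit, periods, initResponseTotal)
# e.g. 99103.13 * 3 = 297309.4 for facebook_s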

@tgtod002

I ran the validation using my data and I'm trying to understand the results.

val_response_a:

48  11/27/2022  $161,919.87  $2,361.20
49  12/4/2022   $127,123.26  $58.60
50  12/11/2022  $269,868.51  $0.90
51  12/18/2022  $240,025.21  $0.01
52  12/25/2022  $242,438.92  $0.00

Val_response_b"

<style> </style>
48 11/27/2022 $    161,919.87 $         2,361.20
49 12/4/2022 $    127,123.26 $              58.60
50 12/11/2022 $    269,868.51 $                0.90
51 12/18/2022 $    240,025.21 $                0.01
52 12/25/2022 $    242,438.92 $                0.00

val_response_c:

            date_min  date_max    initResponseUnit
CSI_SPEND   1/2/2022  12/25/2022  4241.296224
WRAP_SPEND  1/2/2022  12/25/2022  402.289325

Why is val_response_c so low? Does it divide by 52? (I ran the MMM over 52 weeks.)

Thanks again for all your help on this.

@gufengzhou

The response level is calculated from a given spend on top of a given historical carryover/adstock. The given spend for response_c is the avg. spend of the last 52 weeks, and the given carryover is the avg. carryover of those 52 weeks.

Without looking into your data, it's possible that the low response_c comes from a relatively high avg. carryover and/or a relatively low avg. spend compared to the actual per-period levels.

In your case, I suggest you experiment with different date_range values to find the appropriate levels. We recommend using more recent periods rather than going too far back, to reflect your recent adstock and saturation behaviour.
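
For example, a sketch of that experiment (the window labels are illustrative, initSpendUnit is the assumed name of the allocator's avg.-spend column, and export = FALSE just skips writing the one-pager files):

library(dplyr)

# Rerun the allocator over several date_range windows and compare the
# avg. spend and simulated unit response per channel.
for (dr in c("last_4", "last_12", "last_52")) {
  out <- robyn_allocator(
    InputCollect = InputCollect,
    OutputCollect = OutputCollect,
    select_model = select_model,
    date_range = dr,
    scenario = "max_response",
    export = FALSE
  )
  cat("\n== date_range:", dr, "==\n")
  print(out$dt_optimOut %>% select(channels, initSpendUnit, initResponseUnit))
}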

@tgtod002

@gufengzhou, shouldn't response C's initResponseUnit = immediate response + carryover response?

My client creates budgets on a 52-week window. Any suggestions for running allocations on such a window would be welcome.

@AdimDrewnik

> @gufengzhou, shouldn't response C's initResponseUnit = immediate response + carryover response?
>
> My client creates budgets on a 52-week window. Any suggestions for running allocations on such a window would be welcome.

Did you get a better understanding of the budget allocator? Can you share your findings? I have spent so much time on this and I still don't get what is going on here.


gaiaderossinunatac commented Sep 16, 2024

Hello, I was trying to validate between the initial model, robyn_response & robyn_allocator for the very last period (last_period <- 1) using the code above from @gufengzhou.

The initial response (val_response_a) and the response from robyn_response (val_response_b) are identical. However, I’m noticing discrepancies in the results from the robyn_allocator (val_response_c), which I’m unable to account for. Could you provide any insights into why this might be happening?
[screenshot: allocator results showing the discrepancy]

@gufengzhou

Please reopen if necessary.

@AlessioPallotti
Copy link

Hello @gufengzhou! Do you have any updates regarding @gaiaderossinunatac's point? I see that you closed the issue, but there is no response and I have the same problem.

Thank you very much!
