DECOMP.RSSD NOT converged #360

Closed
Aanai opened this issue Mar 31, 2022 · 11 comments
Labels
bug Something isn't working


Aanai commented Mar 31, 2022

Project Robyn

Describe issue

DECOMP.RSSD does not converge. I have tried increasing the number of iterations, removing the modeling window (last 6 months out of 14), aggregating daily data to weekly data to get rid of zeroes but to no avail.

Convergence on last quantile (iters 2850:3000):
DECOMP.RSSD NOT converged: sd@qt.20 0.066 <= 0.072 & med@qt.20 0.061 > -0.029 med@qt.1-3*sd
NRMSE converged: sd@qt.20 0.0041 <= 0.022 & med@qt.20 0.053 <= 0.093 med@qt.1-3*sd

I am working with daily data for 14 months. Could this be because the data is insufficient? I know any good MMM will need a minimum of two years of weekly data. I did create a dummy dataset by duplicating the data I have, but that just gives me an error when I run the budget allocator for an output model.

Error in order(paid_media_vars) : argument 1 is not a vector

Provide dummy data & model configuration

budget <- openxlsx::read.xlsx("~/budget.xlsx", 1, detectDates = TRUE)
budget.df = data.frame(budget)
budget.df[is.na(budget.df)] <- "na"
head(budget.df)

SRM_InputCollect <- robyn_inputs(
  dt_input = budget.df
  ,dt_holidays = dt_prophet_holidays
  ,date_var = "Date" # date format must be "2020-01-01"
  ,dep_var = "Revenue" # there should be only one dependent variable
  ,dep_var_type = "revenue" # "revenue" or "conversion"
  ,prophet_vars = c("trend", "season", "holiday", "weekday") # "trend","season", "weekday" & "holiday"
  ,prophet_signs = c("default", "default", "default", "default")
  ,prophet_country = "US" # input one country. dt_prophet_holidays includes 59 countries by default
  ,paid_media_spends = c("gs_S", "gsh_S", "fbp_S", "fbrt_S") # mandatory input
  ,paid_media_signs = c("positive", "positive", "positive", "positive")
  # ,window_start = "2021-09-01"
  # ,window_end = "2022-02-28"
  ,adstock = "geometric" # geometric, weibull_cdf or weibull_pdf.
)
print(SRM_InputCollect)

hyperparameters <- list(
  gs_S_alphas = c(0.5, 3)
  ,gs_S_gammas = c(0.3, 1)
  ,gs_S_thetas = c(0, 0.3)
  
  ,gsh_S_alphas = c(0.5, 3)
  ,gsh_S_gammas = c(0.3, 1)
  ,gsh_S_thetas = c(0, 0.3)

  ,fbp_S_alphas = c(0.5, 3)
  ,fbp_S_gammas = c(0.3, 1)
  ,fbp_S_thetas = c(0, 0.3)
  
  ,fbrt_S_alphas = c(0.5, 3)
  ,fbrt_S_gammas = c(0.3, 1)
  ,fbrt_S_thetas = c(0, 0.3)
)

OutputModels <- robyn_run(
  InputCollect = SRM_InputCollect # feed in all model specification
  , iterations = 2000 
  , trials = 5 
  , outputs = FALSE 
  ,intercept_sign = "non_negative"
  ,nevergrad_algo = "TwoPointsDE"
)
.
.
OutputCollect <- robyn_outputs(
  SRM_InputCollect, OutputModels
  , pareto_fronts = 3
  , csv_out = "pareto" 
  , clusters = TRUE
  , plot_pareto = TRUE 
  , plot_folder = robyn_object
)
print(OutputCollect)
.
.
.
AllocatorCollect <- robyn_allocator(
  InputCollect = SRM_InputCollect
  , OutputCollect = OutputCollect
  , select_model = select_model
  , scenario = "max_historical_response"
  , channel_constr_low = c(0.7, 0.7, 0.7, 0.7)
  , channel_constr_up = c(1.5, 1.5, 1.5, 1.5)
)
print(AllocatorCollect)
AllocatorCollect$dt_optimOut

Data file here: https://file.io/lSX9CGkEmxuU

Environment & Robyn version

I am running R 4.1.2 and the latest Robyn version.

Aanai closed this as completed Mar 31, 2022
Aanai reopened this Mar 31, 2022
laresbernardo self-assigned this Mar 31, 2022
@laresbernardo
Collaborator

laresbernardo commented Mar 31, 2022

Hi @Aanai

I was able to get your file and replicate your code, but that only creates your SRM_InputCollect object. Can you also share the part of the code that reproduces your issue? I'd need the robyn_run() and robyn_allocator() parts. You can add them to your original comment by editing it. Thanks!

On the other hand, we just landed a fix on robyn_allocator(). Probably (hopefully) this error in the allocation part is going to get fixed. Please update, refresh the session, re-run the code, and let us know.

@Aanai
Author

Aanai commented Mar 31, 2022

Hi @laresbernardo

Thanks for your response. I have included the parts of the code you asked for in the original post.

I get the error in robyn_allocator only when I use dummy data (obtained by duplicating the original data for one year). It works fine with my original data except that DECOMP.RSSD does not converge.

@laresbernardo
Collaborator

laresbernardo commented Mar 31, 2022

I think you missed adding the following to your reproducible code:

SRM_InputCollect <- robyn_inputs(InputCollect = SRM_InputCollect,  hyperparameters = hyperparameters)

And forgot to define which is your actual select_model.

For the sake of this case, I reduced to trials = 1, selected the model 1_100_12, continued, and was able to replicate the issue. For some reason SRM_InputCollect$paid_media_vars was NULL. That's when I noticed you did not pass the following to your robyn_inputs():

robyn_inputs(..., paid_media_vars = c("gs_S","gsh_S","fbp_S", "fbrt_S"), ...)

I re-ran the code with that input and robyn_allocator() ran successfully. So now, when paid_media_vars is NULL, it'll be overwritten by paid_media_spends (which is mandatory) automatically. Thanks for catching this!
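
For readers following along, the fallback described above can be sketched in a few lines of base R. This is only an illustration, not Robyn's internal code: the helper name `resolve_paid_media_vars` is hypothetical, and only the behaviour (a NULL `paid_media_vars` taking the value of the mandatory `paid_media_spends`) comes from this thread.

```r
# Hypothetical helper, NOT Robyn's implementation: illustrates the fallback
# described in this thread, where a NULL paid_media_vars is replaced by the
# mandatory paid_media_spends.
resolve_paid_media_vars <- function(paid_media_spends, paid_media_vars = NULL) {
  if (is.null(paid_media_spends) || length(paid_media_spends) == 0) {
    stop("paid_media_spends is mandatory")
  }
  if (is.null(paid_media_vars)) {
    # Fall back to the spend columns when no exposure columns are given.
    paid_media_vars <- paid_media_spends
  }
  paid_media_vars
}

spends <- c("gs_S", "gsh_S", "fbp_S", "fbrt_S")
vars <- resolve_paid_media_vars(spends)
# vars is now a character vector, so a later order(vars) has a vector to sort.
sorted_vars <- vars[order(vars)]
```

With the fallback in place, code that later calls `order()` on the variable names receives a character vector rather than NULL, which is what the "argument 1 is not a vector" error was complaining about.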

@Aanai
Author

Aanai commented Mar 31, 2022

Thanks, @laresbernardo. I did define select_model; I just failed to reproduce it here. Do you have any idea why DECOMP.RSSD does not converge?

robyn_allocator() fails only on the following dummy data.

@laresbernardo
Collaborator

laresbernardo commented Mar 31, 2022

Re the error: I've just run your script with your dummy data and wasn't able to replicate the issue. I'm 99% sure that updating to the latest version will fix the problem. Can you please check again with the newest Robyn version in a fresh session? You should have 3.6.2 installed.

Re convergence:

  • You can try: increasing iterations, widening the hyperparameter bounds, increasing channel split granularity, adding more historical data, adjusting the modeling window, ... No single setting or parameter always helps convergence.

Keep in mind:

  • Sometimes Robyn can't converge given the data provided.
  • It's not converging given the methodology, criteria, and parameters set by default (n_cuts = 20, sd_qtref = 3, med_lowb = 3).
  • Notice that in your case, your models converged in 3 of the 4 checks; the one that did NOT converge (med@qt.20 0.061 > -0.029 med@qt.1-3*sd) is a weird case that we are internally discussing because we should actually consider the absolute median value instead of the median per se, but have to double-check if that makes sense.
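
To make the defaults above concrete, here is a rough base-R sketch of a quantile-based convergence check. It is NOT Robyn's implementation: the function name, the chunking scheme, and both criteria are assumptions inferred from the parameter names (`n_cuts`, `sd_qtref`, `med_lowb`) and the messages quoted in this thread. It assumes the error series is split into `n_cuts` consecutive chunks, the last chunk's standard deviation is compared with the mean standard deviation of the first `sd_qtref` chunks, and the last chunk's median is compared with the first chunk's median minus `med_lowb` times that reference spread.

```r
# Sketch only: NOT Robyn's implementation. The chunking scheme and both
# criteria are assumptions inferred from the parameter names in this thread.
check_convergence <- function(errors, n_cuts = 20, sd_qtref = 3, med_lowb = 3,
                              use_abs_median = FALSE) {
  stopifnot(is.numeric(errors), sd_qtref <= n_cuts,
            length(errors) >= 2 * n_cuts)
  # Assign consecutive iterations to n_cuts roughly equal chunks.
  chunk <- cut(seq_along(errors), breaks = n_cuts, labels = FALSE)
  chunk_sd <- as.numeric(tapply(errors, chunk, sd))
  chunk_med <- as.numeric(tapply(errors, chunk, median))

  # Reference spread: mean standard deviation of the first sd_qtref chunks.
  ref_sd <- mean(chunk_sd[seq_len(sd_qtref)])
  first_med <- chunk_med[1]
  last_med <- chunk_med[n_cuts]
  if (use_abs_median) {
    first_med <- abs(first_med)
    last_med <- abs(last_med)
  }
  med_thresh <- first_med - med_lowb * ref_sd

  sd_ok <- chunk_sd[n_cuts] <= ref_sd   # spread has not grown
  med_ok <- last_med <= med_thresh      # error level has clearly dropped
  list(sd_ok = sd_ok, med_ok = med_ok, converged = sd_ok && med_ok,
       med_thresh = med_thresh)
}

# Toy error traces (made-up numbers, two chunks of four iterations each).
improving <- c(10, 12, 10, 12, 1, 1, 1, 1)
flat <- c(1, 2, 1, 2, 1, 2, 1, 2)
res_improving <- check_convergence(improving, n_cuts = 2, sd_qtref = 1, med_lowb = 1)
res_flat <- check_convergence(flat, n_cuts = 2, sd_qtref = 1, med_lowb = 1)
res_improving$converged
res_flat$converged
```

Under these assumptions, the improving trace passes both checks, while the flat trace passes the spread check but fails the median check, the same "3 of 4" kind of pattern discussed above.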

@Aanai
Author

Aanai commented Mar 31, 2022

I installed Robyn 3.6.2 and the error is gone. Thanks.

Re convergence

I've tried everything you suggested: increased iterations, changed channel split granularity from daily to weekly, added more historical (dummy) data, and adjusted the modeling window. Some people in Facebook's Robyn community seem to be facing the same issue.

@laresbernardo
Collaborator

That's great!! So we can close this ticket given it's a fixed bug.

As for convergence, I redirected that user to the answer above. There's no one-size-fits-all answer we can provide here beyond those tips for now. We would love to hear from you if you think of other tips or solutions for your specific case. Please share with us anytime!

@laresbernardo
Collaborator

For the record: we've updated the methodology now to use the absolute medians instead of simple medians.

@F1nalFortune

> Re the error: I've just run your script with your dummy data and wasn't able to replicate the issue. I'm 99% sure that updating to the latest version will fix the problem. Can you please check again with the newest Robyn version in a new fresh session? Should have 3.6.2 installed.
>
> Re convergence:
>
> * You can try: increasing iterations, increasing hyperparameters bounds, increasing channels split granularity, adding more data to the past, adjusting the modeling window, ... there's no single answer or parameter that would always help converge.
>
> Keep in mind:
>
> * **Sometimes Robyn can't converge** given the data provided.
> * It's not converging given the methodology, criteria, and parameters **set by default** (n_cuts = 20, sd_qtref = 3, med_lowb = 3).
> * Notice that in your case, your models converged in 3 of the 4 checks; the one that did NOT converge (med@qt.20 0.061 > -0.029 med@qt.1-3*sd) is a weird case that we are internally discussing because we should actually **consider the absolute median value** instead of the median per se, but have to double-check if that makes sense.

@laresbernardo Tried using your suggestions with no luck.

  • Using a Weibull PDF and increasing iterations to 10k did not help.
  • Our hyperparameter bounds are the widest as suggested by the documentation

While all possible shapes are relevant, we recommend c(0.0001, 10) as bounds for shape...
When it comes to scale, we recommend a conservative bound of c(0, 0.1) for scale.

  • Splitting channels or adding new data is not possible.

In the absence of convergence, how can we expect to interpret the results?

DECOMP.RSSD NOT converged: sd@qt.20 0.11 > 0.11 & |med@qt.20| 0.25 > -0.038

@laresbernardo
Collaborator

Thanks a lot for sharing your detailed tests with us, @F1nalFortune.
We've been debating internally whether it makes sense to measure convergence on the DECOMP.RSSD error the way we do, given that a model can be solid, with a low NRMSE that makes sense to the business, despite a "higher" DECOMP.RSSD error. I recently made the default threshold a bit more permissive (from 3-fold to 2-fold), but we are still evaluating a better approach. For now, what really matters is that NRMSE converges; DECOMP.RSSD convergence is good to have. Any suggestions on the matter are welcome.
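
One observation that may help when reading these messages, offered as a sketch rather than a statement about Robyn's internals: the checks in this thread are reported as "value <= threshold", and once the median is taken in absolute value it can never be at or below a negative threshold. The helper below is hypothetical and simply replays the figures reported earlier in this thread (an absolute median of 0.25 against a threshold of -0.038).

```r
# Hypothetical helper, not part of Robyn: evaluates a "value <= threshold"
# median criterion and flags thresholds that an absolute value cannot reach.
median_criterion <- function(last_median, threshold, use_abs = TRUE) {
  value <- if (use_abs) abs(last_median) else last_median
  list(
    value = value,
    threshold = threshold,
    passes = value <= threshold,
    # An absolute value is never negative, so a negative threshold is
    # unreachable no matter how many iterations are run.
    reachable = !use_abs || threshold >= 0
  )
}

# Figures reported in this thread: |median| 0.25 against threshold -0.038.
reported <- median_criterion(0.25, -0.038)
reported$passes
reported$reachable
```

If that reading is right, a negative threshold says more about the spread of the early iterations than about the quality of the final models, which fits the advice above to treat NRMSE convergence as the check that matters.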

@NoufaMukhtar

I followed this thread and tried all the fixes to get DECOMP.RSSD to converge, but no luck. I am using the latest version, and my convergence status is below. Should I proceed without it converging, and how should I interpret the results?

DECOMP.RSSD NOT converged: sd@qt.20 0.21 > 0.15 & |med@qt.20| 0.4 > 0.096
