Questions about influence of factor vars on Prophet and waterfall contribution #384
Comments
Hey @Steven-Livingstone, Prophet and factor vars are used right after Prophet's decomposition and are included in the model as extra coefficients, just as you said in your last comment. In this case it may be worth changing
Hi @Leonelsentana, thank you for those suggestions! While I work on them, could you link to where I would be able to "add the border control event as a regressor in prophet"? Is this what you're referring to in the "Pro-tip: Customize holiday & event information" of the analysts-guide-to-MMM?
So far
Please find the link about how to do it on prophet here: https://facebook.github.io/prophet/docs/seasonality,_holiday_effects,_and_regressors.html#additional-regressors
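For anyone following along, here is a minimal sketch of what that page describes, written for this thread rather than taken from Robyn. The data and the column name border_closed are made up for illustration, and the fit only runs if the prophet package is installed. The calls used (prophet, add_regressor, fit.prophet, predict) are from Prophet's R API.

```r
# Hypothetical weekly data: y drops while the (made-up) border_closed flag is 1.
set.seed(1)
n <- 104
df <- data.frame(
  ds = seq(as.Date("2019-01-07"), by = "week", length.out = n),
  border_closed = as.numeric(seq_len(n) > 60)
)
df$y <- 100 - 30 * df$border_closed + rnorm(n, sd = 5)

if (requireNamespace("prophet", quietly = TRUE)) {
  library(prophet)
  # Create an unfitted model, register the extra column, then fit.
  m <- prophet(weekly.seasonality = FALSE, daily.seasonality = FALSE)
  m <- add_regressor(m, "border_closed")
  m <- fit.prophet(m, df)
  fc <- predict(m, df)
  # The regressor's estimated effect comes back as a column named after it.
  print(summary(fc$border_closed))
}
```

Only columns registered with add_regressor before fitting are treated as regressors; any other extra column in the data frame is ignored by the fit.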
Change intercept_sign parameter to "unconstrained"
Unfortunately this did not have much of an effect, with the waterfall chart still showing counter-intuitive results, albeit with a slightly negative intercept. These tests were also repeated with
Attempts to add additional regressors to Prophet
Looking through Prophet's reference manual on CRAN, the function signature is:
prophet(
df = NULL,
growth = "linear",
changepoints = NULL,
n.changepoints = 25,
changepoint.range = 0.8,
yearly.seasonality = "auto",
weekly.seasonality = "auto",
daily.seasonality = "auto",
holidays = NULL,
seasonality.mode = "additive",
seasonality.prior.scale = 10,
holidays.prior.scale = 10,
changepoint.prior.scale = 0.05,
mcmc.samples = 0,
interval.width = 0.8,
uncertainty.samples = 1000,
fit = TRUE,
... # Additional arguments, passed to fit.prophet
)
The only way (I can see) to include additional regressors in Prophet is via the
Nonetheless, I can see that Robyn already adds all factor vars as Prophet regressors by using the same
if (!is.null(factor_vars) && length(factor_vars) > 0) {
  dt_ohe <- as.data.table(model.matrix(y ~ ., dt_regressors[, c("y", factor_vars), with = FALSE]))[, -1]
  ohe_names <- names(dt_ohe)
  # Adds factor_vars
  for (addreg in ohe_names) modelRecurrence <- add_regressor(modelRecurrence, addreg)
  # Adds context_vars and paid_media_spends
  dt_ohe <- cbind(dt_regressors[, !factor_vars, with = FALSE], dt_ohe)
  mod_ohe <- fit.prophet(modelRecurrence, dt_ohe)
Context_vars and paid_media_spends are also included as Prophet regressors using
Add Border_Closure_Control as a Prophet Holiday
This had the effect of swinging the
Follow on Questions
From analyzing the above code, it seems that Robyn uses the
Is this reasonable (I'm willing to be completely off the mark though :P )?
Hi @Steven-Livingstone, apologies for the delay in my response, I have been pretty busy lately. context_vars are added as Prophet regressors only when they're factors; this way we convert factors to numeric vars for easy processing, leveraging Prophet. We believe that continuous numeric context_vars can reflect the variance over time better than factor context_vars, so you may want to try that. A numeric context_var will not go into Prophet as a regressor; it goes directly into the ridge regression. Hope it helps!
Hi @Leonelsentana, no problem :) Honestly, the support you (and the Robyn dev team) are providing is simply awesome! Thank you all so much.
Your suggestion seems to be working as intended given my testing (see the graph from above in "Example: Adding Border_Closure_Control as just a context_var"). Here,
The only nitpick I have is that I can't seem to find the reason for it working in the code. Below is the code that I believe "convert[s] factors to numeric vars for easy processing leveraging prophet", with some of my comments added.
# If there exists factor_vars, convert to numeric vars and add them using add_regressor fn
if (!is.null(factor_vars) && length(factor_vars) > 0) {
  dt_ohe <- as.data.table(model.matrix(y ~ ., dt_regressors[, c("y", factor_vars), with = FALSE]))[, -1]
  ohe_names <- names(dt_ohe)
  for (addreg in ohe_names) modelRecurrence <- add_regressor(modelRecurrence, addreg)
  dt_ohe <- cbind(dt_regressors[, !factor_vars, with = FALSE], dt_ohe)
  mod_ohe <- fit.prophet(modelRecurrence, dt_ohe)
  dt_forecastRegressor <- predict(mod_ohe, dt_ohe)
  forecastRecurrence <- dt_forecastRegressor[, str_detect(
    names(dt_forecastRegressor), "_lower$|_upper$",
    negate = TRUE
  ), with = FALSE]
  for (aggreg in factor_vars) {
    oheRegNames <- na.omit(str_extract(names(forecastRecurrence), paste0("^", aggreg, ".*")))
    forecastRecurrence[, (aggreg) := rowSums(.SD), .SDcols = oheRegNames]
    get_reg <- forecastRecurrence[, get(aggreg)]
    dt_transform[, (aggreg) := scale(get_reg, center = min(get_reg), scale = FALSE)]
  }
  # Else, simply use dt_regressors for prophet decomp
} else {
  mod <- fit.prophet(modelRecurrence, dt_regressors)
  forecastRecurrence <- predict(mod, dt_regressors)
}
Looking above to line 688 to see where
dt_regressors <- cbind(recurrence, subset(dt_transform, select = c(context_vars, paid_media_spends)))
Here we see
Perhaps it's my inexperience with Prophet, but is my interpretation of this code correct?
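To make the quoted block easier to follow, here is a small base R sketch of two of its steps on made-up data: the model.matrix one-hot encoding (where [, -1] drops the intercept column) and the final scale(..., center = min(...), scale = FALSE) shift. The data frame, the border column and the effect values are hypothetical; Robyn's own code works on data.table objects and on Prophet's forecast output instead.

```r
# Hypothetical data: a two-level factor standing in for Border_Closure_Control.
dt <- data.frame(
  y = c(10, 12, 5, 4, 11),
  border = factor(c("open", "open", "closed", "closed", "open"),
                  levels = c("open", "closed"))
)

# model.matrix expands the factor into dummy columns using treatment contrasts.
mm <- model.matrix(y ~ ., dt)

# Drop the "(Intercept)" column, leaving one 0/1 column per non-reference level.
ohe <- mm[, -1, drop = FALSE]
print(colnames(ohe))  # "borderclosed"

# Made-up per-row effect, standing in for the summed Prophet regressor columns.
effect <- c(-3, -3, -8, -8, -3)

# Subtract the minimum without rescaling, so the smallest effect becomes 0.
shifted <- as.numeric(scale(effect, center = min(effect), scale = FALSE))
print(shifted)  # 5 5 0 0 5
```

As far as the quoted code shows, it is this shifted effect that replaces the factor column in dt_transform. In the sketch, the rows where the border is closed end up at 0 and the open rows at a positive value, even though the underlying effect is negative.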
Hi @Steven-Livingstone, apologies for the delay again. We are adding those columns, but we are not using them at any point in the Prophet fit: Prophet will just take ds and y from the data frame, so no context_vars. Apologies for the confusion!
Project Robyn
Describe issue
Border lock-downs due to COVID have had an obvious and negative impact on the dependent variable. To measure this impact, border lock-downs have been encoded as a categorical context variable in Robyn (0 = open_border, 1 = closed_border). Let's call this Border_Closure_Control.
I know (by eyeballing the graph) that closing the border has had a prolonged negative impact on dep_var and so added Border_Closure_Control as a control var. However, my results differ significantly when also adding Border_Closure_Control as a factor_var versus when I don't.
Why would excluding Border_Closure_Control as a factor_var give the expected results of a negative contribution (see examples below), and how are context and factor vars used in Prophet's decomposition?
I also ran tests with just Border_Closure_Control as the only context variable (with several media vars), which produced similar results. Perhaps I have misunderstood how Prophet is using context and factor vars?
Thank you for any help or insight you can offer! Cheers :)
Further details
Note: I followed these documentation recommendations when including Border_Closure_Control.
The following models differ only in their use of Border_Closure_Control. Almost all candidate models from each Robyn run share the same characteristics concerning Border_Closure_Control.
Example: Adding Border_Closure_Control as both a context_var and factor_var
Border_Closure_Control has been included in Prophet's deseasonalization plot (above).
Border_Closure_Control has a very positive contribution towards sales. This does not make sense to me.
Border_Closure_Control used default in paid_media_signs.
Example: Adding Border_Closure_Control as just a context_var
Border_Closure_Control is missing from Prophet's deseasonalization plot.
Border_Closure_Control now has a very negative contribution towards sales.
Border_Closure_Control used default in paid_media_signs.
Example: Adding Border_Closure_Control as both a context_var and factor_var AND forcing Border_Closure_Control to be negative
Border_Closure_Control is included as a factor_var, so it is also included in Prophet's deseasonalization plot.
Border_Closure_Control now has zero contribution towards sales.
Border_Closure_Control used negative in paid_media_signs.
Investigations
prophet_decomp, here seems to be the determining logic that decides whether factor_vars get one-hot-encoded and added to Prophet's deseasonalization process.
Border_Closure_Control in prophet_decomp.png.
I understand that trend and seasonality are used as extra coefficients in the ridge regression model in order to give them equal opportunity to explain the dependent variable, but how does deseasonalizing the dep_var relate to the context and factor vars?
Environment & Robyn version
R version = 3.6.4