Error: Index larger than maximal for convex hull plots #77

Open
sharpant opened this issue Aug 1, 2018 · 6 comments

Comments

@sharpant

sharpant commented Aug 1, 2018

I have a model object (ans1X) trained with xgboost. The dataset has 82,527 rows. I can use partial to make a partial dependence plot with one variable. However, when I try to make a two-variable plot restricted to the convex hull, using the code below:

p1 <- partial(ans1X, pred.var = c("fix_eff7", "fix_eff2"), plot = TRUE, chull = TRUE, train = train)

I get the following error:

Error in intI(i, n = x@Dim[1], dn[[1]], give.dn = FALSE) : index larger than maximal 82527

Could you please help out?

@bgreenwell
Owner

It's hard to say without a reproducible example. To narrow down the issue, can you make sure that train is a data frame in the argument train = train?

@bgreenwell
Owner

On second thought, are you using a sparse matrix from the Matrix package?

@sharpant
Author

sharpant commented Aug 1, 2018

Yes, I am using a sparse matrix from the Matrix package. If I instead pass a data frame as `train = train`, while the ans1X xgboost object was still trained on the sparse matrix, I get the following error:

Error in names(pd.df) <- c(pred.var, "yhat") : 'names' attribute [3] must be the same length as the vector [2]
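
To read the message above: it says that three names (the two entries of pred.var plus "yhat") were assigned to an object with only two elements. The snippet below is a minimal sketch that triggers the same kind of base R error on a toy data frame. The object `toy` and its columns are made up for illustration, and the snippet does not show why pdp ended up with two columns in this case.

```r
# Toy data frame with two columns, standing in for a result
# that has one column fewer than expected (hypothetical data).
toy <- data.frame(a = 1:3, b = 4:6)

# Assigning three names to a two-column object raises an error;
# capture the message instead of stopping the script.
msg_names <- tryCatch(
  {
    names(toy) <- c("fix_eff7", "fix_eff2", "yhat")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg_names)
```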

@bgreenwell
Owner

It should work with any combination of training data formats (see https://github.com/bgreenwell/pdp/blob/master/slowtests/slowtests-xgboost.R). I'll see if I can find where the true issue is. The error you are getting is thrown from the Matrix package. It would be easier if you could provide a reproducible example. Maybe sample the data, if it's not sensitive?
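
For context on the Matrix error: a minimal sketch, assuming the failure comes from indexing a sparse matrix with a row number beyond its last row. The thread does not confirm that this is what happens inside partial. The toy matrix `m` and the vector `bad_rows` are made up, and the exact wording of the message depends on the installed Matrix version.

```r
library(Matrix)

# A 3-row sparse matrix standing in for the 82,527-row training set.
m <- sparseMatrix(i = c(1, 2, 3), j = c(1, 2, 3), x = c(1, 2, 3))

# Row indices that run past nrow(m), e.g. indices computed
# against a different object (hypothetical scenario).
bad_rows <- c(1, 5)

# Capture the error message rather than halting.
msg_index <- tryCatch(
  {
    m[bad_rows, , drop = FALSE]
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg_index)
```

Comparing the largest index used with nrow() of the training object is a quick way to check whether a subsetting mismatch of this kind is involved.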

@sharpant
Author

sharpant commented Aug 1, 2018

The sparse matrix works with partial when there is only one variable. The problem occurs when there are two variables and I am using chull = TRUE.

I have now created the ans1X xgboost object using a data frame, which was a much slower process (I kept the sparse matrix format for cross-validation in the previous step). In partial I passed train = train as a data frame, and now I get the convex hull plot. So I found a temporary solution to the problem. Thanks for your help! As the data is sensitive, I am unable to share a reproducible example using it.
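
A minimal sketch of the conversion step behind this workaround, assuming the predictors start out as a sparse matrix from the Matrix package. The data here are random and only the column names come from the thread. How the model itself was refit is not shown in the thread, so that step is left out.

```r
library(Matrix)

set.seed(1)

# Random sparse predictors standing in for the real (sensitive) data.
X_sparse <- rsparsematrix(nrow = 100, ncol = 2, density = 0.3)
colnames(X_sparse) <- c("fix_eff7", "fix_eff2")

# Densify, then convert to a data frame. This can use a lot of
# memory when the sparse matrix has many columns.
train_df <- as.data.frame(as.matrix(X_sparse))
str(train_df)
```

The resulting `train_df` is the kind of object that would then be passed as train = train in the partial call.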

@bgreenwell
Owner

As long as you have a reasonable workaround for now! I'll dig into this when I find some free time and see if I can pin down the exact cause (it might just be a subsetting issue). Will get back to you soon!


2 participants