Thanks for putting this package together - it's great to have a scorecard package in Python.
However, it appears the weight of evidence binning algorithm only works with complete data, even though it should factor in missing data.
As a simple test, I added a column containing some NaN values to the germancredit.csv file and reran the example code. The woebin function breaks as described in other threads (e.g. #78). Is this on the radar for a fix?
Cheers,
Ryan
I can't reproduce your issue. The package should be able to handle missing values. Please upgrade to the latest version on GitHub and try again.
I was able to trace this back to line 126 of woebin.py (line 116 in version 1.9.2, the one available on PyPI).
The specific code is: dtm = dtm[~dtm.index.isin(dtm_sv.index)].reset_index() if len(dtm_sv.index) < len(dtm.index) else None
This deletes the rows of dtm (the dataset used for the final table) whose index matches an index in dtm_sv, the list of missing values (i.e. from 0 up to however many missing values you have). I can't make sense of this line or why it's still included, but this is where the rows with missing values are deleted.
As a workaround for now, I can just comment out or delete this line and everything works perfectly.
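To make the index behaviour described above concrete, here is a minimal sketch using plain pandas on toy data. It is not the package's code: the toy frame, the helper name drop_special_rows, and the way dtm_sv is built in each case are assumptions for illustration only. It just shows that the quoted expression removes rows by index label, so what gets dropped depends entirely on which index dtm_sv carries.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): 'x' is missing at row labels 2 and 4.
dtm = pd.DataFrame({
    "y": [0, 1, 0, 1, 0, 1],
    "x": [1.0, 2.0, np.nan, 4.0, np.nan, 6.0],
})


def drop_special_rows(dtm, dtm_sv):
    """Hypothetical wrapper around the expression quoted from woebin.py."""
    return (
        dtm[~dtm.index.isin(dtm_sv.index)].reset_index()
        if len(dtm_sv.index) < len(dtm.index)
        else None
    )


# Case A (assumption): dtm_sv was re-indexed to 0..k-1, so its labels no
# longer point at the missing rows. Rows 0 and 1 are dropped instead and
# the NaN rows stay in the result.
dtm_sv_reset = dtm[dtm["x"].isna()].reset_index(drop=True)
result_reset = drop_special_rows(dtm, dtm_sv_reset)

# Case B (assumption): dtm_sv keeps the original labels (2 and 4), so the
# same expression drops exactly the rows with missing values.
dtm_sv_aligned = dtm[dtm["x"].isna()]
result_aligned = drop_special_rows(dtm, dtm_sv_aligned)

print(result_reset)
print(result_aligned)
```

In both cases the "index" column added by reset_index() records which original rows survived, which makes it easy to check which of the two situations applies to a given dataset.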