Improve the reporting of invalid input values #145

Closed
wzxiong opened this issue Feb 19, 2019 · 3 comments

wzxiong commented Feb 19, 2019

After converting a LightGBM model to PMML, I get an error like:
expected: 'DOUBLE', got: '4.0'. error: org.jpmml.evaluator.InvalidResultException

This happens because the PMML file contains a structure like:

<DataField name="feature_name" optype="continuous" dataType="double">
<Interval closure="closedClosed" leftMargin="0.0" rightMargin="3.0"/>
</DataField>

Just removing this part, <Interval closure="closedClosed" leftMargin="0.0" rightMargin="7034.5"/>, makes it work fine.

@vruusmann
Member

<Interval closure="closedClosed" leftMargin="0.0" rightMargin="3.0"/>

This Interval element declaration means: "If the value of the 'feature_name' input field is greater than or equal to 0.0 and less than or equal to 3.0, then this input value is valid and it's OK to proceed with model scoring. Otherwise, this input value is invalid, and the model cannot be scored."

See section "Values and Intervals" here:
http://dmg.org/pmml/v4-3/DataDictionary.html#xsdElement_Interval
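
To make the validity rule concrete, here is a minimal Python sketch of the interval check described above. It is not JPMML-Evaluator code; the function name `in_interval` and its parameters are made up for illustration. Only the four closure names and the optional margins come from the PMML schema.

```python
import operator

# Closure names as defined for the PMML Interval element. Each maps to the
# pair of comparisons applied against (leftMargin, rightMargin).
CLOSURES = {
    "openOpen": (operator.gt, operator.lt),
    "openClosed": (operator.gt, operator.le),
    "closedOpen": (operator.ge, operator.lt),
    "closedClosed": (operator.ge, operator.le),
}


def in_interval(value, closure, left_margin=None, right_margin=None):
    """Return True if value lies inside the interval (hypothetical helper).

    A margin of None means that side is unbounded, mirroring the fact that
    leftMargin and rightMargin are optional attributes.
    """
    left_ok, right_ok = CLOSURES[closure]
    if left_margin is not None and not left_ok(value, left_margin):
        return False
    if right_margin is not None and not right_ok(value, right_margin):
        return False
    return True
```

With the declaration quoted above, `in_interval(3.0, "closedClosed", 0.0, 3.0)` is valid, while the value 4.0 from the error message falls outside the interval and is therefore treated as invalid.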

And if you think that this restriction is not appropriate for your application scenario, then you should do one of the following:

  1. Extend the bounds of the Interval element so that all possible input values are classified as valid.
  2. Delete the Interval element.
  3. Keep the Interval element, but specify "it's OK to proceed with invalid input values" by setting the value of the corresponding MiningField@invalidValueTreatment attribute to "asIs" (http://dmg.org/pmml/v4-3/MiningSchema.html#xsdType_INVALID-VALUE-TREATMENT-METHOD).
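
Options 2 and 3 can be applied by post-processing the PMML file. The following is a rough sketch using only Python's standard `xml.etree.ElementTree` module. The helper names (`drop_intervals`, `allow_invalid_values`) and the sample document are assumptions for illustration, not part of any JPMML tool, and the sample omits the actual model content.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed PMML document used only for this sketch.
SAMPLE = """
<PMML xmlns="http://www.dmg.org/PMML-4_3" version="4.3">
  <DataDictionary>
    <DataField name="feature_name" optype="continuous" dataType="double">
      <Interval closure="closedClosed" leftMargin="0.0" rightMargin="3.0"/>
    </DataField>
  </DataDictionary>
  <MiningModel functionName="regression">
    <MiningSchema>
      <MiningField name="feature_name"/>
    </MiningSchema>
  </MiningModel>
</PMML>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def drop_intervals(root, field_name):
    """Option 2: delete every Interval child of the named DataField."""
    for field in root.iter():
        if local_name(field.tag) == "DataField" and field.get("name") == field_name:
            for child in list(field):  # copy, since we mutate while looping
                if local_name(child.tag) == "Interval":
                    field.remove(child)


def allow_invalid_values(root, field_name):
    """Option 3: set invalidValueTreatment="asIs" on the named MiningField."""
    for field in root.iter():
        if local_name(field.tag) == "MiningField" and field.get("name") == field_name:
            field.set("invalidValueTreatment", "asIs")


root = ET.fromstring(SAMPLE.strip())
allow_invalid_values(root, "feature_name")
```

Matching on the local tag name keeps the sketch independent of the PMML schema version in the namespace URI. In practice you would pick one of the two helpers, not both, and write the modified tree back to disk.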

TLDR: The JPMML-Evaluator library is behaving 100% correctly, by preventing you from using a model with invalid input values.

@vruusmann
Member

My takeaway is that perhaps there should be a new, special-purpose exception type for capturing this situation.

The name of the current exception type reads as InvalidResultException, which is kind of difficult to relate to the fact that the root cause of the problem is an invalid input value.

@vruusmann changed the title from "Error when input value exceed bound" to "Improve the reporting of invalid input values" on Feb 19, 2019
@wzxiong
Author

wzxiong commented Feb 19, 2019

Thank you so much. Maybe it's just not that common to see previously unseen values in an OOT dataset.
