Customizing the "missing value handling"-mode of models #25

Open
rajpalparyani opened this issue Jul 18, 2017 · 2 comments

Comments

@rajpalparyani

Hi,

I am using jpmml-sparkml version 1.2.4 to generate PMML models in Spark (using Scala) and save the output to the local file system, but I can't figure out how to set the following attributes:

<xs:attribute name="missingValueStrategy" type="MISSING-VALUE-STRATEGY" default="none"/>
<xs:attribute name="missingValuePenalty" type="PROB-NUMBER" default="1.0"/>
<xs:attribute name="noTrueChildStrategy" type="NO-TRUE-CHILD-STRATEGY" default="returnNullPrediction"/>

I have searched online but haven't found any clues.

I'd really appreciate any help.

Thanks,
Raj

@vruusmann
Member

Apache Spark ML decision tree models do not support such "execution flow" customizations. Therefore, it is impossible to generate the requested PMML markup automatically.

Possible workaround:

  1. Generate a "raw" PMML class model object by invoking ConverterUtil#toPMML(StructType, PipelineModel).
  2. Apply one or more post-processing Visitors to it.
  3. Save the "post-processed" PMML class model object to a file.

In your case, the Visitor needs to modify the TreeModel elements (nested inside the top-level MiningModel element). Here's a starting point:

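// Imports assume the JPMML-Model 1.3.x package layout; adjust for your version
import org.dmg.pmml.PMML;
import org.dmg.pmml.Visitor;
import org.dmg.pmml.VisitorAction;
import org.dmg.pmml.tree.TreeModel;
import org.dmg.pmml.tree.TreeModel.MissingValueStrategy;
import org.dmg.pmml.tree.TreeModel.NoTrueChildStrategy;
import org.jpmml.model.visitors.AbstractVisitor;

PMML pmml = ConverterUtil.toPMML(...);

// Anonymous Visitor that is invoked once for every TreeModel element
Visitor treeModelUpdater = new AbstractVisitor(){

	@Override
	public VisitorAction visit(TreeModel treeModel){
		treeModel.setMissingValueStrategy(MissingValueStrategy.NULL_PREDICTION);
		treeModel.setMissingValuePenalty(0d);
		treeModel.setNoTrueChildStrategy(NoTrueChildStrategy.RETURN_NULL_PREDICTION);

		// Keep traversing so that every TreeModel in the ensemble is updated
		return VisitorAction.CONTINUE;
	}
};
treeModelUpdater.applyTo(pmml);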

@vruusmann
Member

Related to #14

It should be possible to toggle PMML converters between two modes:

  1. "Missing value"-friendly (a missing input leads to a missing prediction).
  2. "Missing value"-hostile (a missing input raises an error, or leads to a non-missing default prediction).
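To make the two modes concrete, here is a minimal, self-contained sketch in plain Java. It does not use JPMML or Spark ML: the class `MissingValueModes`, the `Mode` enum, the split threshold and the default prediction are all hypothetical names and values chosen for illustration only. A `null` input stands in for a missing value.

```java
public class MissingValueModes {

    /** Hypothetical modes; these names are not part of JPMML or Spark ML. */
    enum Mode { FRIENDLY, HOSTILE_ERROR, HOSTILE_DEFAULT }

    // Illustrative constants, not taken from any real model
    static final double THRESHOLD = 0.5;
    static final double DEFAULT_PREDICTION = 0.0;

    /** Scores a one-split "tree"; a null input represents a missing value. */
    static Double predict(Double input, Mode mode) {
        if (input == null) {
            switch (mode) {
                case FRIENDLY:
                    // A missing input leads to a missing prediction
                    return null;
                case HOSTILE_ERROR:
                    // A missing input raises an error
                    throw new IllegalArgumentException("Missing input value");
                default:
                    // A missing input leads to a non-missing default prediction
                    return DEFAULT_PREDICTION;
            }
        }
        // Regular scoring path: left leaf predicts 1.0, right leaf predicts 2.0
        return (input <= THRESHOLD) ? 1.0 : 2.0;
    }

    public static void main(String[] args) {
        System.out.println(predict(0.3, Mode.FRIENDLY));
        System.out.println(predict(null, Mode.FRIENDLY));
        System.out.println(predict(null, Mode.HOSTILE_DEFAULT));
    }
}
```

In this sketch the friendly mode roughly corresponds to what the nullPrediction strategy in the Visitor example asks of a PMML scoring engine, while the two hostile variants show the alternatives named in item 2.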

@vruusmann vruusmann changed the title How to invoke the in jpmml-sparkml Customizing the "missing value handling"-mode of models Jul 18, 2017