> Exception in thread "main" java.lang.IllegalArgumentException: Tuple contains an unsupported value (Python class sklearn.preprocessing.data.Normalizer)
A pipeline has to end with an estimator, not a transformer.
**Hi,
I am trying to run this code to vectorize some text and then classify it with a logistic regression:**
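from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn2pmml.feature_extraction.text import Splitter
from sklearn2pmml.pipeline import PMMLPipeline

pipeline = PMMLPipeline([
    ("tfidf", TfidfVectorizer(
        norm=None,  # L2 normalization disabled
        ngram_range=(1, 2),
        # min_df=5,
        max_df=0.5,
        analyzer="word",
        max_features=1000,
        token_pattern=None,
        tokenizer=Splitter()))
])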
Unfortunately, normalization is not available in sklearn2pmml, and the results are not good enough without it.
So I am thinking about exporting one PMML for the TF-IDF part, normalizing the results, and then exporting another PMML for the classification part. The normalization part would be written in Java and run between the two PMMLs.
But I cannot use a PMMLPipeline that contains only a TfidfVectorizer transformer. With this code:
from sklearn2pmml import sklearn2pmml

pipeline = PMMLPipeline([
    ("tfidf", TfidfVectorizer(
        norm=None,  # L2 normalization disabled
        ngram_range=(1, 2),
        # min_df=5,
        max_df=0.5,
        analyzer="word",
        max_features=1000,
        token_pattern=None,
        tokenizer=Splitter()))
])
# Fit the vectorizer on the training texts, then attempt the PMML export.
model = pipeline.fit(x_train)
sklearn2pmml(model, "model_text7.pmml", with_repr=True, debug=True)
I got this error message:
Exception in thread "main" java.lang.IllegalArgumentException: Tuple contains an unsupported value (Python class sklearn.preprocessing.data.Normalizer)
    at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:43)
    at com.google.common.collect.Lists$TransformingRandomAccessList$1.transform(Lists.java:616)
    at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:47)
    at sklearn.pipeline.Pipeline.encodeFeatures(Pipeline.java:68)
    at sklearn2pmml.pipeline.PMMLPipeline.encodePMML(PMMLPipeline.java:202)
    at org.jpmml.sklearn.Main.run(Main.java:145)
    at org.jpmml.sklearn.Main.main(Main.java:94)
Caused by: java.lang.ClassCastException: Cannot cast net.razorvine.pickle.objects.ClassDict to sklearn.Transformer
    at java.lang.Class.cast(Class.java:3369)
    at org.jpmml.sklearn.CastFunction.apply(CastFunction.java:41)
    ... 6 more
Is there a way to export a TfidfVectorizer transformer on its own to PMML? Or to implement L2 normalization in a PMML pipeline?
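For reference, here is a minimal sketch of the arithmetic an L2 normalization step would have to perform on each TF-IDF row between the two PMML documents. It is written in Python only to match the code above (the step described earlier would be in Java). The function name `l2_normalize` and the sample values are hypothetical, and returning all-zero rows unchanged is an assumption made to avoid dividing by zero.

```python
import math
from typing import List, Sequence


def l2_normalize(row: Sequence[float]) -> List[float]:
    """Scale one TF-IDF row so that its Euclidean (L2) length is 1."""
    norm = math.sqrt(sum(value * value for value in row))
    if norm == 0.0:
        # Assumption: a row with no matching terms is returned unchanged
        # rather than divided by zero.
        return list(row)
    return [value / norm for value in row]


# Hypothetical un-normalized TF-IDF scores for a single document.
tfidf_row = [3.0, 0.0, 4.0]
print(l2_normalize(tfidf_row))  # [0.6, 0.0, 0.8]
```

Each output row would then be passed as the input features of the classification PMML.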