Supporting sklearn.feature_extraction.text.TfidfVectorizer
#4
Comments
This is a major change, because it also requires updating the core JPMML-Evaluator library. The difficult part is class Earlier this week, I did take a deeper look into
Can you provide an example script that would capture the intended use of There's example usage in the SkLearn documentation, but I'm interested in finding out the exact details of your use case. For demonstration purposes, you could replace your real text corpus with some demo text corpus (publicly accessible on the internet).
I'm currently working almost exclusively with text datasets, so this functionality would indeed be really helpful. And I'm pretty sure this is something other people would appreciate. Regarding the script, I think you can consider the example given in the sklearn documentation http://scikit-learn.org/stable/modules/feature_extraction.html#text-feature-extraction, since in my case I'm just using the default parameters of Thank you very much
Hello @vruusmann. Is there any news regarding this issue? Thank you.
I too am interested if there is any progress on this front.
Fixed in commit 65636a7
It would be great if the transformation sklearn.feature_extraction.text.TfidfVectorizer can be supported by JPMML-sklearn. It would be even better if both sklearn.feature_extraction.text.CountVectorizer and sklearn.feature_extraction.text.TfidfTransformer can be supported (TfidfVectorizer is the combination of these two).
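
To make the "combination of these two" point concrete, here is a minimal pure-Python sketch of the two stages. It does not use scikit-learn or JPMML-sklearn; the function names (count_vectorize, tfidf_transform, tfidf_vectorize) and the demo corpus are hypothetical. It assumes what I understand to be the scikit-learn defaults: lowercasing, tokens of two or more word characters, smoothed IDF computed as ln((1 + n) / (1 + df)) + 1, and L2 row normalization.

```python
import math
import re
from collections import Counter

# Assumed default tokenization: runs of two or more word characters.
TOKEN = re.compile(r"\b\w\w+\b")


def count_vectorize(docs):
    """Stage 1 (CountVectorizer-like): build a vocabulary and raw term counts."""
    tokenized = [TOKEN.findall(doc.lower()) for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    index = {tok: i for i, tok in enumerate(vocab)}
    counts = []
    for toks in tokenized:
        row = [0] * len(vocab)
        for tok, n in Counter(toks).items():
            row[index[tok]] = n
        counts.append(row)
    return vocab, counts


def tfidf_transform(counts):
    """Stage 2 (TfidfTransformer-like): smoothed IDF weighting, then L2 norm."""
    n_docs = len(counts)
    n_terms = len(counts[0]) if counts else 0
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_terms)]
    idf = [math.log((1 + n_docs) / (1 + d)) + 1.0 for d in df]
    weighted_rows = []
    for row in counts:
        weighted = [c * w for c, w in zip(row, idf)]
        norm = math.sqrt(sum(x * x for x in weighted)) or 1.0  # guard empty rows
        weighted_rows.append([x / norm for x in weighted])
    return weighted_rows


def tfidf_vectorize(docs):
    """TfidfVectorizer-like: simply stage 1 followed by stage 2."""
    vocab, counts = count_vectorize(docs)
    return vocab, tfidf_transform(counts)


# Hypothetical demo corpus, standing in for a real text dataset.
corpus = ["The cat sat on the mat", "The dog sat", "A cat and a dog"]
vocab, counts = count_vectorize(corpus)
_, tfidf = tfidf_vectorize(corpus)
```

Since the combined step is just the two stages chained, a converter that handles the counting stage and the weighting stage separately would cover the combined vectorizer as well, at least under the default parameters assumed here.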