-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
#313 enabled fix differing versions of dependency scikit learn #316
#313 enabled fix differing versions of dependency scikit learn #316
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
During nbtests in CI we observed the error message
|
[CodeBuild]
@@ -0,0 +1,108 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The specific version of the library used by the default SLC depends ...
"AI-Lab adapts its own version of the library if required." I don't think it's a good explanation of what is happening here. AI-Lab itself is not doing anything. The user should run the script like the one in this notebook to make sure the versions are aligned.
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the paragraph in the next push.
Please review again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -0,0 +1,108 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you split this code cell in two parts and add markdown cells for each part explaining the code?
In the first part we create a UDF that reads the version of the scikit_learn library. The UDF runs inside the language container, therefore the library version detected by the UDF is the version installed in this container.
In the second part we compare the version returned by the UDF with the version in the AI-Lab environment. If they differ we install the UDF's version in the AI-Lab environment.
BTW, do we need to run pip with the --upgrade option?
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, see next push.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW, do we need to run pip with the --upgrade option?
That I cannot answer. My tests have been successful without this option.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
@@ -91,7 +91,7 @@ | |||
"\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Line #12. model.fit(X_train.values, y_train)
Why? The model is happy to take a DataFrame as an input.
Reply via ReviewNB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We observed an error message as specified in comment-2293085752.
The notebook sklearn_train_abalone.ipynb did use .values
already.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe earlier versions of scikit-learn didn't care about feature names. It seems that when we read input in udf like ctx.get_dataframe(num_rows=1000, start_col=1)
we do not get the column names in the pandas DataFrame. That's strange.
Closes #313