* #313: Enabled fixing differing versions of dependency scikit-learn
* Added functionality to notebook sklearn_fix_version
* Updated notebook test
* Fixed sk_train to remove feature labels
* [CodeBuild]
Showing 5 changed files with 142 additions and 6 deletions.
132 changes: 132 additions & 0 deletions
...ds/sandbox/runtime/ansible/roles/jupyter/files/notebook/sklearn/sklearn_fix_version.ipynb
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,132 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"id": "289d2a8c-953d-46e5-8c73-ad810c29b20f", | ||
"metadata": {}, | ||
"source": [ | ||
"# Fix the Version of Python Library Scikit-learn\n", | ||
"\n", | ||
"This notebook ensures the AI-Lab is using the same version of the python library `scikit-learn` as the one used by the built-in [Script Language Container (SLC)](https://docs.exasol.com/db/latest/database_concepts/udf_scripts/adding_new_packages_script_languages.htm#ScriptLanguageContainer) inside the Exasol database.\n", | ||
"\n", | ||
"## Rationale\n", | ||
"\n", | ||
"Using identical versions is required when transferring the Scikit-learn model from the AI-Lab to the database SLC.\n", | ||
"\n", | ||
"The AI-Lab serializes the Scikit-learn model with [pickle](https://docs.python.org/3/library/pickle.html) and uploads it into the BucketFS of the database. The UDF using the built-in SLC can only _deserialize_ the model if it is using the same version of Scikit-learn as was used for serializing it. The specific version of the library available in the built-in SLC depends on the release version of the database and cannot be controlled by the AI-Lab.\n", | ||
"\n", | ||
"Running the following script will update the version of the library used in the AI-Lab, if required.\n", | ||
"\n", | ||
"## Open Secure Configuration Storage" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "d86ca808-044e-4fbd-be30-5ba8324f501e", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"%run ../utils/access_store_ui.ipynb\n", | ||
"display(get_access_store_ui('../'))" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "055ed302-69aa-426c-b5ec-861c63b82d33", | ||
"metadata": {}, | ||
"source": [ | ||
"## Detect the Version of Scikit-learn Used in the SLC\n", | ||
"\n", | ||
"The following cell creates a User Defined Function (UDF) called `detect_scikit_learn_version()` and then executes the UDF using the built-in SLC via an SQL statement.\n", | ||
"\n", | ||
"The UDF inquires and returns the version of Scikit-learn available in the built-in SLC which is then stored in variable `slc_scikit_learn_version`." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "fa6c628f-853e-4850-8bab-46f7f645856e", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import textwrap\n", | ||
"from exasol.nb_connector.connections import open_pyexasol_connection\n", | ||
"\n", | ||
"sql = textwrap.dedent(\"\"\"\n", | ||
"CREATE OR REPLACE PYTHON3 SCALAR SCRIPT {schema!q}.detect_scikit_learn_version() RETURNS VARCHAR(100) AS\n", | ||
"import sklearn\n", | ||
"def run(ctx):\n", | ||
" return sklearn.__version__ \n", | ||
"/\n", | ||
"\"\"\")\n", | ||
"\n", | ||
"with open_pyexasol_connection(ai_lab_config, compression=True) as conn:\n", | ||
" query_params={'schema': ai_lab_config.db_schema}\n", | ||
" conn.execute(sql, query_params)\n", | ||
" result = conn.execute(\"select {schema!q}.detect_scikit_learn_version()\", query_params).fetchone()\n", | ||
" slc_scikit_learn_version = result[0]" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"id": "e4b0dc24-6e02-4305-8fa1-15f68afac360", | ||
"metadata": {}, | ||
"source": [ | ||
"## Compare the Scikit-learn Version and Update the AI-Lab if Required\n", | ||
"\n", | ||
"The next cell compares the Scikit-learn version returned by the UDF with the Scikit-learn version in the AI-Lab environment. If they differ, then the cell installs the UDF's Scikit-learn version in the AI-Lab environment." | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "50b88871-4c37-4cc1-ac85-841a22e98153", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import sklearn\n", | ||
"from importlib import reload\n", | ||
"\n", | ||
"my_version = sklearn.__version__\n", | ||
"\n", | ||
"if slc_scikit_learn_version == my_version:\n", | ||
" print(f\"AI-Lab scikit-learn version {my_version} is identical to that of the SLC.\\nNothing to do.\")\n", | ||
"else:\n", | ||
" print(f\"AI-Lab scikit-learn version {my_version} differs from SLC.\\nInstalling version {slc_scikit_learn_version} ...\")\n", | ||
" %pip install \"scikit_learn=={slc_scikit_learn_version}\"\n", | ||
" sklearn = reload(sklearn)\n", | ||
" print(f\"Updated AI-Lab scikit-learn to version {sklearn.__version__}.\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"id": "b0ea6891-d171-4841-a2d8-edf8ac252d86", | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "Python 3 (ipykernel)", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.10.12" | ||
} | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 5 | ||
} |
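
Note on the rationale in the notebook: the version requirement comes from pickle storing references to library classes and their internal state, so a model pickled under one library version may not load under another. One way to make such a mismatch visible early is to store the library version next to the pickled model and compare it before unpickling. The following is only a minimal sketch of that idea and is not part of this commit. The helper names `dump_with_version` and `load_checked` are hypothetical, and a plain dict stands in for a trained model so the example runs without Scikit-learn.

```python
import pickle


def dump_with_version(model, library_version: str) -> bytes:
    """Pickle the model and wrap it with the library version that produced it.

    Hypothetical helper: the model is pickled separately so that the
    version can be inspected without unpickling the model itself.
    """
    envelope = {
        "library_version": library_version,
        "model_bytes": pickle.dumps(model),
    }
    return pickle.dumps(envelope)


def load_checked(blob: bytes, runtime_version: str):
    """Return the model only if it was pickled with the runtime's version."""
    envelope = pickle.loads(blob)  # contains only a str and bytes
    recorded = envelope["library_version"]
    if recorded != runtime_version:
        raise RuntimeError(
            f"Model was pickled with version {recorded}, "
            f"but the runtime provides version {runtime_version}."
        )
    return pickle.loads(envelope["model_bytes"])


# Stand-in for a trained model; a real model object would be passed instead.
blob = dump_with_version({"coef": [1, 2]}, "1.0.2")
model = load_checked(blob, "1.0.2")
```

In the setting of the notebook, the first argument would be the trained model, the recorded version would be `sklearn.__version__` in the AI-Lab, and the runtime version would be `sklearn.__version__` inside the UDF.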
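
Note on the last code cell of the notebook: `importlib.reload(sklearn)` re-executes only the top-level package module, while submodules that were already imported stay as they were, so restarting the kernel after the install is the more reliable way to pick up the new version. As a complement, the installed version can be read from the distribution metadata instead of from the imported module. The sketch below is not part of this commit. The helper names `installed_version` and `requirement_if_needed` are hypothetical, and only the standard library module `importlib.metadata` is used.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def requirement_if_needed(dist_name: str, target_version: str):
    """Return a pinned pip requirement if the installed version differs.

    Returns None when the installed version already equals the target,
    which mirrors the "Nothing to do" branch of the notebook cell.
    """
    if installed_version(dist_name) == target_version:
        return None
    return f"{dist_name}=={target_version}"
```

In the notebook, `requirement_if_needed("scikit-learn", slc_scikit_learn_version)` would yield the string to pass to `%pip install`, or `None` when both versions already match.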