feat: replace other values than NaN with imputer (#707)
Closes #643

### Summary of Changes

* Add an optional `value_to_replace` argument to `Imputer` that sets
which value the imputer replaces. It can be an int, float, or string.
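
A minimal sketch of the idea in plain Python, not the Safe-DS implementation: only the name `value_to_replace` comes from this commit. The helpers `impute_constant` and `_matches` are hypothetical, and the sketch assumes NaN stays the default value to replace.

```python
import math


def _matches(cell, value_to_replace):
    """Return True if `cell` is the value that should be replaced."""
    # NaN never compares equal to itself, so it needs an explicit check.
    if isinstance(value_to_replace, float) and math.isnan(value_to_replace):
        return isinstance(cell, float) and math.isnan(cell)
    return cell == value_to_replace


def impute_constant(column, fill_value, value_to_replace=math.nan):
    """Return a copy of `column` with matching cells replaced by `fill_value`.

    Hypothetical helper for illustration; assumes NaN is the default target.
    """
    return [
        fill_value if _matches(cell, value_to_replace) else cell
        for cell in column
    ]


# Default behavior: replace NaN.
ages = impute_constant([22.0, math.nan, 35.0], fill_value=0.0)

# Configured behavior: replace a string placeholder instead.
ports = impute_constant(["S", "?", "C"], fill_value="unknown", value_to_replace="?")
```

The explicit NaN branch is the main design point: a plain equality check would silently skip NaN cells, which is why a configurable target value needs special handling for the default.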

---------

Co-authored-by: megalinter-bot <[email protected]>
lars-reimann and megalinter-bot authored May 3, 2024
1 parent 36e4a7a commit 4a059e0
Showing 5 changed files with 359 additions and 288 deletions.
72 changes: 36 additions & 36 deletions docs/tutorials/data_processing.ipynb
@@ -32,15 +32,15 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from safeds.data.tabular.containers import Table\n",
"\n",
"titanic = Table.from_csv_file(\"data/titanic.csv\")"
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -54,15 +54,15 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic_slice = titanic.slice_rows(end=10)\n",
"\n",
"titanic_slice # just to show the output"
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -76,13 +76,13 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic_slice.get_row(0)"
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -96,13 +96,13 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic_slice.get_column(\"name\")"
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -116,7 +116,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"Table.from_rows([\n",
" titanic_slice.get_row(0),\n",
@@ -125,7 +124,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -139,7 +139,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"Table.from_columns([\n",
" titanic_slice.get_column(\"name\"),\n",
@@ -148,7 +147,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -162,7 +162,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic_slice.remove_columns([\n",
" \"id\",\n",
@@ -175,7 +174,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -189,13 +189,13 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic_slice.keep_only_columns([\"name\", \"survived\"])"
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -211,13 +211,13 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic_slice.sort_columns()"
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -231,7 +231,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic_slice.sort_columns(\n",
" lambda column1, column2:\n",
@@ -240,7 +239,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -254,7 +254,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic.filter_rows(\n",
" lambda row:\n",
@@ -263,7 +262,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -278,7 +278,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from safeds.data.tabular.transformation import Imputer\n",
"\n",
@@ -287,7 +286,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -301,7 +301,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from safeds.data.tabular.transformation import LabelEncoder\n",
"\n",
@@ -310,7 +309,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -324,7 +324,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from safeds.data.tabular.transformation import OneHotEncoder\n",
"\n",
@@ -333,7 +332,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -347,7 +347,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from safeds.data.tabular.transformation import RangeScaler\n",
"\n",
@@ -356,7 +355,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -370,7 +370,6 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"from safeds.data.tabular.transformation import StandardScaler\n",
"\n",
@@ -379,7 +378,8 @@
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -394,13 +394,13 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic_slice.transform_column(\"sex\", lambda row: 1 if row.get_value(\"sex\") == \"female\" else 0)\n"
],
"metadata": {
"collapsed": false
}
},
"outputs": []
},
{
"cell_type": "markdown",
@@ -414,13 +414,13 @@
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"titanic_slice.transform_column(\"parents_children\", lambda row: \"No\" if row.get_value(\"parents_children\") == 0 else \"Yes\")\n"
],
"metadata": {
"collapsed": false
}
},
"outputs": []
}
],
"metadata": {