Find indexes of item in list #20812
Comments
The output of the proposed
import polars as pl

name_value_map = {
    "one": 1,
    "seventeen": 17,
    "null": None,
    "fiftyfive": 55
}
df = pl.DataFrame({"a": [1, None, 17, 1]})
(
    df
    .with_row_index()
    .select(
        [
            pl.col("index")
            .filter(
                # Option A: with `index_of`
                pl.col("a").index_of(value).over("index").is_not_null()
                # Option B: without `index_of` (use in place of Option A)
                # (pl.col("a") == value) |
                # (pl.col("a").is_null() & (value is None))
            )
            .implode()
            .alias(name)
            for name, value in name_value_map.items()
        ]
    )
)
There is also
df.select(
    pl.col("a").eq_missing(value).arg_true().implode()
    .alias(name)
    for name, value in name_value_map.items()
)
# shape: (1, 4)
# ┌───────────┬───────────┬───────────┬───────────┐
# │ one       ┆ seventeen ┆ null      ┆ fiftyfive │
# │ ---       ┆ ---       ┆ ---       ┆ ---       │
# │ list[u32] ┆ list[u32] ┆ list[u32] ┆ list[u32] │
# ╞═══════════╪═══════════╪═══════════╪═══════════╡
# │ [0, 3]    ┆ [2]       ┆ [1]       ┆ []        │
# └───────────┴───────────┴───────────┴───────────┘
Thanks @cmdlineluser, would you say that using the compound expression should be favored over adding a dedicated expression like
The result of
df.select(
    pl.col("a").eq_missing(value).arg_true().first()
    .alias(name)
    for name, value in name_value_map.items()
)
If there is no intention to add an expression to return all the indexes of an item in a column/list then I would suggest to:
Description
In #19894 `index_of` was introduced, which gets the index of the first occurrence of an item in a column/list when available. This is a very helpful feature, but why only return the first occurrence and not the first `n` or all of them?
Could `index_of` be changed to `indexes_of`, where it returns all the occurrences of the item in the list and an empty list when the item is not found? To get back the current behavior (minus the return type), an optional keyword argument `n` = number of indexes to return could be added to this expression or to `indexes_of_exact` (like in `split_exact`).
Expanding on the example in the documentation for `index_of` we would get the following usage:
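As a rough illustration of the semantics requested above (this is not the example from the issue and not an existing Polars API), a plain-Python reference sketch might look as follows. The function name `indexes_of` and the `n` parameter follow the proposal; everything else is an assumption made for illustration.

```python
from typing import Any, List, Optional, Sequence


def indexes_of(values: Sequence[Any], item: Any, n: Optional[int] = None) -> List[int]:
    """Hypothetical reference semantics for the proposed expression (not a Polars API)."""
    matches: List[int] = []
    for i, v in enumerate(values):
        # Stop once the requested number of indexes has been collected.
        if n is not None and len(matches) >= n:
            break
        if v is None or item is None:
            # A null only matches a null.
            is_match = v is None and item is None
        else:
            is_match = v == item
        if is_match:
            matches.append(i)
    return matches


column = [1, None, 17, 1]
all_ones = indexes_of(column, 1)        # every occurrence
first_one = indexes_of(column, 1, n=1)  # at most one index, still a list
nulls = indexes_of(column, None)        # null lookup
missing = indexes_of(column, 55)        # item not present
```

With `n=1` the result matches the current `index_of` behavior except for the return type, which is a one-element list instead of a scalar.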