BLOG: How scikit-lego became dataframe-agnostic using Narwhals #846
Co-authored-by: Athan <[email protected]>
Very impressive results there, in a nice short article!
Comments are a couple of spelling errors, a quibble, and one tentative sentence change.
Good stuff!!!
thanks @rec for your review, appreciate it! do you know why the CI job is failing? I've looked at the logs but they're pretty opaque to me
…ght-website into scikit-lego-narwhals
To be honest, I'm a little shocked that adding a .md file would change anything in code execution. Perhaps it has to do with the changes to .vscode/settings.json, which seem out-of-place here?
Well, that suggestion of mine didn't work. :-/ This message: I am unable to log in to vercel with my github account: could it be that a previous commit caused this error and you're blameless? Likely you need to find someone who knows that system....
## How does it work?

Let's take a look at `sklego.pandas_utils.add_lags`. The code before version 0.9.0 did something like this:
Suggested change:
Let's take a look at `sklego.pandas_utils.add_lags` as a tangible example that demonstrates how you might be able to leverage narwhals in your own code. The code before version 0.9.0 did something like this:
This also reminds me, we probably want to rename that section of the API.
Furthermore, converting to pandas may present a cost - for example, if you start with a Polars LazyFrame, then you're required to call `.collect` on it before converting to pandas.
Suggested change:
to pandas. Effectively, that would remove all the benefit of using a `LazyFrame` in the first place!
not totally sure about this one - I mean:
df = pl.scan_parquet(...).with_columns(...).filter(...).group_by(...).agg(...).collect().to_pandas()
scikit_lego_func(df)
would still be better than
df = pl.read_parquet(...).with_columns(...).filter(...).group_by(...).agg(...).to_pandas()
scikit_lego_func(df)
Looks pretty good to me, left some comments to consider tho.
Thanks all for your comments! 🙏 I think I've addressed everything. Any further comments, or do we want to ship this?
This is a nice read. Thanks Marco for the post, and all reviewers for the review effort!
> Any further comments, or do we want to ship this?
Looks like we're all good here, so I'll go ahead and hit the green button:)
(cherry picked from commit 82569df)
Link to the post for reference: https://labs.quansight.org/blog/scikit-lego-narwhals
…#854) * Apply suggestions from code review * Apply suggestions from code review fixed typpos * HOTFIX pseudocode post pub date * Release 2023-04-11 (#718) * Update README to reflect making main the default branch * BLOG: How scikit-lego became dataframe-agnostic using Narwhals (#846) (cherry picked from commit 82569df) * chore(deps-dev): bump @graphql-codegen/typescript from 2.5.1 to 4.0.7 Bumps [@graphql-codegen/typescript](https://github.com/dotansimha/graphql-code-generator/tree/HEAD/packages/plugins/typescript/typescript) from 2.5.1 to 4.0.7. - [Release notes](https://github.com/dotansimha/graphql-code-generator/releases) - [Changelog](https://github.com/dotansimha/graphql-code-generator/blob/master/packages/plugins/typescript/typescript/CHANGELOG.md) - [Commits](https://github.com/dotansimha/graphql-code-generator/commits/@graphql-codegen/[email protected]/packages/plugins/typescript/typescript) --- updated-dependencies: - dependency-name: "@graphql-codegen/typescript" dependency-type: direct:development update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <[email protected]> --------- Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: Brian Skinn <[email protected]> Co-authored-by: gabalafou <[email protected]> Co-authored-by: Noa Tamir <[email protected]> Co-authored-by: Pavithra Eswaramoorthy <[email protected]> Co-authored-by: Ralf Gommers <[email protected]> Co-authored-by: Marco Edward Gorelli <[email protected]> Co-authored-by: Tania Allard <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>