Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BLOG: How scikit-lego became dataframe-agnostic using Narwhals #846

Merged
merged 26 commits into from
Jun 17, 2024

Conversation

MarcoGorelli
Copy link
Contributor

Text styling

  • The blog is written with plain language (where relevant).
  • If there are headers, they use the proper header tags in order to do so (with only one level-one header).
  • All links describe where they link to (for example, check the Quansight labs website).
  • Any kind of styling that the author uses (for example, bold for emphasis) is consistent throughout the blog.

Non-text contents

  • Blog post featured image is in PNG or JPEG format, not SVG.
  • All content is represented as text (for example, images need alt text and videos need captions or descriptive transcripts).
  • If there are emojis, there are not more than three in a row.
  • Don't use flashing gifs or videos.
  • If it were to be read as plain text, the blog still makes sense and no information is missing.

Copy link

vercel bot commented May 26, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
labs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Jun 17, 2024 6:53pm

Copy link

@rec rec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very impressive results there, in a nice short article!

Comments are a couple of spelling errors, a quibble, and one tentative sentence change.

Good stuff!!!

apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
@MarcoGorelli
Copy link
Contributor Author

thanks @rec for your review, appreciate it! do you know why the CI job is failing? I've looked at the logs but they're pretty opaque to me

@rec
Copy link

rec commented May 27, 2024

To be honest, I'm a little shocked that adding a .md file would change anything in code execution.

Perhaps it has to do with the changes to .vscode/settings.json, which seem out-of-place here?

@rec
Copy link

rec commented May 27, 2024

Well, that suggestion of mine didn't work. :-/

This message: Warning: data for page "/blog/[slug]" (path "/blog/numpy-string-ufuncs") is 252 kB which exceeds the threshold of 128 kB, this amount of data can reduce performance. is baffling because you don't even mention that indirectly as far as I can see. (It's a warning but that triggers an error.)

I am unable to log in to vercel with my github account: could it be that a previous commit caused this error and you're blameless?

Likely you need to find someone who knows that system....


## How does it work?

Let's take a look at `sklego.pandas_utils.add_lags`. The code before version 0.9.0 did something like this:
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Let's take a look at `sklego.pandas_utils.add_lags`. The code before version 0.9.0 did something like this:
Let's take a look at `sklego.pandas_utils.add_lags` as a tangible example that demonstrates how you might be able to leverage narwhals in your own code. The code before version 0.9.0 did something like this:

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also reminds me, we probably want to rename that section of the API.


Furthermore, converting to pandas may present a cost - for example, if you start with a Polars
LazyFrame, then you're required to call `.collect` on it before converting
to pandas.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
to pandas.
to pandas. Effectively, that would remove all the benefit of using a `LazyFrame` in the first place!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not totally sure about this one - I mean:

df = pl.scan_parquet(...).with_columns(...).filter(...).group_by(...).agg(...).collect().to_pandas()
scikit_lego_func(df)

would still be better than

df = pl.read_parquet(...).with_columns(...).filter(...).group_by(...).agg(...).to_pandas()
scikit_lego_func(df)

Copy link

@koaning koaning left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me, left some comments to consider tho.

@MarcoGorelli MarcoGorelli added the labs 🔭 Items related to the Labs website label May 27, 2024
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
@MarcoGorelli
Copy link
Contributor Author

Thanks all for your comments! 🙏

I think I've addressed everything

Any further comments, or do we want to ship this?

Copy link
Member

@rgommers rgommers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nice read. Thanks Marco for the post, and all reviewers for the review effort!

Any further comments, or do we want to ship this?

Looks like we're all good here, so I'll go ahead and hit the green button:)

apps/labs/posts/scikit-lego-narwhals.md Outdated Show resolved Hide resolved
@rgommers rgommers merged commit 82569df into Quansight:develop Jun 17, 2024
2 checks passed
rgommers pushed a commit that referenced this pull request Jun 17, 2024
@rgommers rgommers mentioned this pull request Jun 17, 2024
@rgommers
Copy link
Member

Link to the post for reference: https://labs.quansight.org/blog/scikit-lego-narwhals

gabalafou added a commit that referenced this pull request Jun 18, 2024
…#854)

* Apply suggestions from code review

* Apply suggestions from code review

fixed typpos

* HOTFIX pseudocode post pub date

* Release 2023-04-11 (#718)

* Update README to reflect making main the default branch

* BLOG: How scikit-lego became dataframe-agnostic using Narwhals (#846)

(cherry picked from commit 82569df)

* chore(deps-dev): bump @graphql-codegen/typescript from 2.5.1 to 4.0.7

Bumps [@graphql-codegen/typescript](https://github.com/dotansimha/graphql-code-generator/tree/HEAD/packages/plugins/typescript/typescript) from 2.5.1 to 4.0.7.
- [Release notes](https://github.com/dotansimha/graphql-code-generator/releases)
- [Changelog](https://github.com/dotansimha/graphql-code-generator/blob/master/packages/plugins/typescript/typescript/CHANGELOG.md)
- [Commits](https://github.com/dotansimha/graphql-code-generator/commits/@graphql-codegen/[email protected]/packages/plugins/typescript/typescript)

---
updated-dependencies:
- dependency-name: "@graphql-codegen/typescript"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brian Skinn <[email protected]>
Co-authored-by: gabalafou <[email protected]>
Co-authored-by: Noa Tamir <[email protected]>
Co-authored-by: Pavithra Eswaramoorthy <[email protected]>
Co-authored-by: Ralf Gommers <[email protected]>
Co-authored-by: Marco Edward Gorelli <[email protected]>
Co-authored-by: Tania Allard <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
melissawm pushed a commit to melissawm/Quansight-website that referenced this pull request Jul 23, 2024
…Quansight#854)

* Apply suggestions from code review

* Apply suggestions from code review

fixed typpos

* HOTFIX pseudocode post pub date

* Release 2023-04-11 (Quansight#718)

* Update README to reflect making main the default branch

* BLOG: How scikit-lego became dataframe-agnostic using Narwhals (Quansight#846)

(cherry picked from commit 82569df)

* chore(deps-dev): bump @graphql-codegen/typescript from 2.5.1 to 4.0.7

Bumps [@graphql-codegen/typescript](https://github.com/dotansimha/graphql-code-generator/tree/HEAD/packages/plugins/typescript/typescript) from 2.5.1 to 4.0.7.
- [Release notes](https://github.com/dotansimha/graphql-code-generator/releases)
- [Changelog](https://github.com/dotansimha/graphql-code-generator/blob/master/packages/plugins/typescript/typescript/CHANGELOG.md)
- [Commits](https://github.com/dotansimha/graphql-code-generator/commits/@graphql-codegen/[email protected]/packages/plugins/typescript/typescript)

---
updated-dependencies:
- dependency-name: "@graphql-codegen/typescript"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brian Skinn <[email protected]>
Co-authored-by: gabalafou <[email protected]>
Co-authored-by: Noa Tamir <[email protected]>
Co-authored-by: Pavithra Eswaramoorthy <[email protected]>
Co-authored-by: Ralf Gommers <[email protected]>
Co-authored-by: Marco Edward Gorelli <[email protected]>
Co-authored-by: Tania Allard <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
gabalafou added a commit to isabela-pf/Quansight-website that referenced this pull request Sep 4, 2024
…Quansight#854)

* Apply suggestions from code review

* Apply suggestions from code review

fixed typpos

* HOTFIX pseudocode post pub date

* Release 2023-04-11 (Quansight#718)

* Update README to reflect making main the default branch

* BLOG: How scikit-lego became dataframe-agnostic using Narwhals (Quansight#846)

(cherry picked from commit 82569df)

* chore(deps-dev): bump @graphql-codegen/typescript from 2.5.1 to 4.0.7

Bumps [@graphql-codegen/typescript](https://github.com/dotansimha/graphql-code-generator/tree/HEAD/packages/plugins/typescript/typescript) from 2.5.1 to 4.0.7.
- [Release notes](https://github.com/dotansimha/graphql-code-generator/releases)
- [Changelog](https://github.com/dotansimha/graphql-code-generator/blob/master/packages/plugins/typescript/typescript/CHANGELOG.md)
- [Commits](https://github.com/dotansimha/graphql-code-generator/commits/@graphql-codegen/[email protected]/packages/plugins/typescript/typescript)

---
updated-dependencies:
- dependency-name: "@graphql-codegen/typescript"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brian Skinn <[email protected]>
Co-authored-by: gabalafou <[email protected]>
Co-authored-by: Noa Tamir <[email protected]>
Co-authored-by: Pavithra Eswaramoorthy <[email protected]>
Co-authored-by: Ralf Gommers <[email protected]>
Co-authored-by: Marco Edward Gorelli <[email protected]>
Co-authored-by: Tania Allard <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
gabalafou added a commit to isabela-pf/Quansight-website that referenced this pull request Sep 4, 2024
…Quansight#854)

* Apply suggestions from code review

* Apply suggestions from code review

fixed typpos

* HOTFIX pseudocode post pub date

* Release 2023-04-11 (Quansight#718)

* Update README to reflect making main the default branch

* BLOG: How scikit-lego became dataframe-agnostic using Narwhals (Quansight#846)

(cherry picked from commit 82569df)

* chore(deps-dev): bump @graphql-codegen/typescript from 2.5.1 to 4.0.7

Bumps [@graphql-codegen/typescript](https://github.com/dotansimha/graphql-code-generator/tree/HEAD/packages/plugins/typescript/typescript) from 2.5.1 to 4.0.7.
- [Release notes](https://github.com/dotansimha/graphql-code-generator/releases)
- [Changelog](https://github.com/dotansimha/graphql-code-generator/blob/master/packages/plugins/typescript/typescript/CHANGELOG.md)
- [Commits](https://github.com/dotansimha/graphql-code-generator/commits/@graphql-codegen/[email protected]/packages/plugins/typescript/typescript)

---
updated-dependencies:
- dependency-name: "@graphql-codegen/typescript"
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: Brian Skinn <[email protected]>
Co-authored-by: gabalafou <[email protected]>
Co-authored-by: Noa Tamir <[email protected]>
Co-authored-by: Pavithra Eswaramoorthy <[email protected]>
Co-authored-by: Ralf Gommers <[email protected]>
Co-authored-by: Marco Edward Gorelli <[email protected]>
Co-authored-by: Tania Allard <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
labs 🔭 Items related to the Labs website type: content 📝
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants