XlsxWriter Roadmap v2 #1028

Open
jmcnamara opened this issue Nov 3, 2023 · 8 comments
@jmcnamara
Owner

jmcnamara commented Nov 3, 2023

Previous roadmap

XlsxWriter is almost 10 years old. The first version was released on February 17, 2013. According to pypinfo it has around 12 million monthly downloads, so it is probably fair to say that it has been useful.

Recently I have been porting/rewriting XlsxWriter in Rust and it has been an interesting experience. When I'm finished with the Rust port, sometime near the end of 2024, I'd like to revisit XlsxWriter and bring it up to date with modern Python and practice. Some ideas:

  • Type annotation with documentation on all (200) public APIs.
  • Better error handling and reporting. Also better API argument testing with the help of the type annotations.
  • More exceptions.
  • An Image type like in the Rust version. This would make improvements like a worksheet.insert_image_fit_to_cell() method easier to implement.
  • A Color type like in the Rust version which will allow a uniform implementation of theme colors (often requested).
  • Maybe a Formula type like in the Rust version.
  • Better autofit().
  • More modular structuring of the Chart internals to allow more formatting.
  • Support for separation of cell data and formatting like in the Rust version.
  • Other cleanups and refactoring.
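
To make the Color and type annotation ideas above a little more concrete, here is a minimal sketch of what a typed color value could look like in Python. Everything in it is hypothetical: the `Color` class, its `rgb`, `theme` and `tint` fields, and the `from_hex()`, `from_theme()` and `to_hex()` helpers are illustration names only and are not part of the XlsxWriter API or a committed design.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Color:
    """Hypothetical color value: either a fixed RGB value or a theme reference."""

    rgb: Optional[int] = None    # 0xRRGGBB, or None for a theme color
    theme: Optional[int] = None  # index into the theme palette (assumed field)
    tint: float = 0.0            # lighten/darken factor for theme colors (assumed field)

    @classmethod
    def from_hex(cls, text: str) -> Color:
        """Parse '#RRGGBB' or 'RRGGBB' into an RGB color."""
        digits = text.lstrip("#")
        if len(digits) != 6:
            raise ValueError(f"expected 6 hex digits, got {text!r}")
        # int() raises ValueError for non-hex characters.
        return cls(rgb=int(digits, 16))

    @classmethod
    def from_theme(cls, index: int, tint: float = 0.0) -> Color:
        """Build a theme color from a palette index and an optional tint."""
        return cls(theme=index, tint=tint)

    def to_hex(self) -> str:
        """Return '#RRGGBB' for RGB colors; theme colors have no fixed value."""
        if self.rgb is None:
            raise ValueError("theme colors have no fixed RGB value")
        return f"#{self.rgb:06X}"
```

With a single type like this, an annotated signature can accept RGB and theme colors through one parameter, and a type checker can flag a bad argument before the file is written.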
@max-muoto

@jmcnamara Would you consider having the next version just rely on the Rust version through pyo3 to realize performance benefits for the Python interface?

@jmcnamara
Owner Author

jmcnamara commented Jan 25, 2024

Would you consider having the next version just rely on the Rust version through pyo3 to realize performance benefits for the Python interface?

@max-muoto

I don't think that would be practical from a maintenance point of view or desirable from an end user point of view. At the moment the Python version has zero dependencies and more functionality than the Rust version.

However, I would see scope for a "lite" version of XlsxWriter + pyo3 with support for just writing data and formatting. Something that could be consumed by Pandas, for example, to speed up file writing. From rough initial benchmarks that could be about 8x faster than the pure Python version. I see that Pandas recently adopted a Rust backed xlsx reader based on Calamine so they might be open to a similar writer. I'll keep it in mind.

@max-muoto

Makes sense, thanks for the info!

I think a minimal version for compatibility with Polars/Pandas would be great. Polars also recently added support for Calamine as a reader, so I feel this is something they might be pretty open to as well.

@jmcnamara
Owner Author

Polars also recently added support for Calamine as a reader, so I feel this is something they might be pretty open to as well.

That is good to know.

I think a minimal version for compatibility with Polars/Pandas would be great.

Polars could take the Rust version directly. I wrote polars_excel_writer as a prototype for that and there has been some initial engagement with the Polars folks here.

For Pandas I started a PyO3 wrapper called xlsxwriter_lite. However, it is currently very rudimentary.

@alexander-beedie
Contributor

alexander-beedie commented Feb 28, 2024

Polars could take the Rust version directly. I wrote polars_excel_writer as a prototype for that and there has been some initial engagement with the Polars folks here.

We've been thinking about taking calamine as a direct Polars (Rust) dependency to squeeze every last possible drop of speed out of it; if/when we get around to that it might be time to revisit the writing side (though unless somebody suddenly gets a lot of unexpected free time this might take a while 😅)

@jkyeung
Contributor

jkyeung commented May 29, 2024

Would you consider having the next version just rely on the Rust version through pyo3 to realize performance benefits for the Python interface?

I don't think that would be practical from a maintenance point of view or desirable from an end user point of view. At the moment the Python version has zero dependencies and more functionality than the Rust version.

Eventually, the Rust version may have equal or greater functionality than the Python version.

But I fully agree with @jmcnamara that having zero dependencies is desirable. In fact, it is a lifeline for those of us who want to use the full capabilities of XlsxWriter on systems with Python but no support for Rust. Even for systems that do support Rust, there will be some users who find the pure-Python XlsxWriter fast enough for their needs and would rather not introduce extra downloads or dependencies.

@ijustlovemath

I just wanted to add a +1 for type annotations! Following this issue with great interest ^^

@jmcnamara
Owner Author

As part of the XlsxWriter 2 refactoring I'm looking for comments on: Refactoring of return values to Enums or Exceptions #1100
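
For anyone weighing the two options, here is a rough sketch of the difference between reporting a problem through an Enum return value and through an exception. It is a toy, not the proposal in #1100: `WriteStatus`, `CellOutOfBoundsError`, `check_cell()` and `require_cell()` are hypothetical names, and the only real facts assumed are Excel's worksheet limits of 1,048,576 rows and 16,384 columns.

```python
from enum import Enum

MAX_ROWS = 1_048_576  # Excel's row limit
MAX_COLS = 16_384     # Excel's column limit


class WriteStatus(Enum):
    """Hypothetical named replacement for bare integer return codes."""

    OK = 0
    OUT_OF_BOUNDS = -1


class CellOutOfBoundsError(ValueError):
    """Hypothetical exception for a cell outside the worksheet limits."""


def check_cell(row: int, col: int) -> WriteStatus:
    """Enum style: the caller has to inspect the returned status."""
    if 0 <= row < MAX_ROWS and 0 <= col < MAX_COLS:
        return WriteStatus.OK
    return WriteStatus.OUT_OF_BOUNDS


def require_cell(row: int, col: int) -> None:
    """Exception style: an invalid cell cannot be silently ignored."""
    if check_cell(row, col) is not WriteStatus.OK:
        raise CellOutOfBoundsError(f"cell ({row}, {col}) is outside the worksheet")
```

The Enum style keeps the existing check-the-return-value flow but gives the codes readable names. The exception style changes behavior for callers who currently ignore the return value, which is the main compatibility question.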
