Bug with Open XML SDK v3.0 and excels with malformed hyperlinks #1636
Comments
We encountered the same problem and found that it comes from this line: Open-XML-SDK/src/DocumentFormat.OpenXml.Framework/Packaging/WriteableStreamExtensions.cs Line 24 in 15dafe1
Calling A workaround is to copy the stream to a MemoryStream first. But maybe this method should check |
I tried that: I read it and then copied it to a MemoryStream with |
I'd rather not have to copy to a memory stream first - let me take a look at the repro. Thanks for bringing it to our attention |
- We must be able to seek to create a temporary file required for URI handling
- We should throw the same exception we did in v2.x
- If compat mode is on, this will throw at open time (i.e. it will eagerly load everything)
Fixes #1636
Ok here's what I'm putting in for a v3.0.1 release:
If you want the framework to handle read-only/non-seekable streams, the current setup won't work well for that. I've been working on a builder pattern to set up a pipeline of sorts for what should happen when you open a package (see #1541). Once we get v3.0.1 out, I'll enable it again on CI builds for testing, and we can look at an opt-in mechanism for scenarios like this. |
@ilayna I couldn't repro the exact error you're seeing, so the fixes I'm pushing in may not actually fix this issue. Can you either verify once it's merged in or provide a repro I can try out? |
@ilayna the fix is merged in and a build will be available within the hour on the CI feed (documented on the README). Can you take a look to see if this fixes your issue? |
OC, I'll check and get back to you. |
@twsouthwick
On the other hand, when I use
|
Okay, I'll try to come up with something, |
Thanks - I believe there is something else at play. I think I got fixated on #1636 (comment) when it was a different issue. Let me know when you have something - we'll be cutting a 3.0.1 release with the fixes so far, but will be able to do a 3.0.2 if this doesn't require public API changes. |
I've got the same problem after upgrading to version 3.0.1: when there is a malformed URI in the parsed Excel file, I get "System.ObjectDisposedException: Cannot access a closed stream."
Besides, let me share how I deal with this problem in my app: first I open the provided file stream in read-only mode and try to parse it, because that is the most memory-efficient way to handle it. If an exception is thrown due to a malformed URI, the file stream is copied into a byte buffer. Then a writable stream is opened from the byte buffer and the document is parsed again from that buffer. Because the stream is writable, the malformed URIs can be corrected in the second attempt. Since malformed URIs don't occur in many documents, this approach reduces memory consumption by using a byte buffer only when necessary. I wish there was a better solution, but this at least works. |
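The two-pass approach described in the comment above could be sketched roughly as follows. This is only a minimal illustration under assumptions: `FallbackParser`, `ParseWithFallback`, `openSource` and `parse` are hypothetical names and not part of the Open XML SDK, the source is reopened for the second pass because the first stream may already be closed or partially consumed, and the broad `catch` stands in for whatever exception the malformed URI actually raises in your SDK version.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the Open XML SDK.
public static class FallbackParser
{
    // openSource: returns a fresh read-only stream over the document on each call.
    // parse:      reads the document; the bool tells it whether the stream is editable.
    public static T ParseWithFallback<T>(Func<Stream> openSource, Func<Stream, bool, T> parse)
    {
        try
        {
            // First attempt: parse directly from the read-only stream,
            // so no extra copy of the file is held in memory.
            using Stream readOnly = openSource();
            return parse(readOnly, false);
        }
        catch (Exception)
        {
            // Assumption: a failure here is caused by a malformed URI.
            // Real code should catch only the exception types it has observed.
        }

        // Second attempt: copy the document into a writable, seekable buffer
        // so that the parser is allowed to modify the package.
        using Stream source = openSource();
        using var buffer = new MemoryStream();
        source.CopyTo(buffer);
        buffer.Position = 0;
        return parse(buffer, true);
    }
}
```

With the SDK, `parse` would typically wrap a call such as `SpreadsheetDocument.Open(stream, isEditable)`, the same overload quoted in the issue description. Correcting the malformed URIs themselves is left out of this sketch.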
This does appear similar to #1802 which I found a root cause for. There's been some work in this area - can someone check with a CI build of 3.2 or wait for the release of that (later this week)? |
Describe the bug
Getting the root element of a WorkbookPart of an Excel file with invalid hyperlinks causes the following exception:
Opening the .xlsx file with Excel, selecting all columns and clicking "Remove Hyperlinks" solves the problem.
To Reproduce
Steps to reproduce the behavior:
Move the excel with the malformed hyperlinks to "Excels".
Run the following code.
See error.
Note: in practice, I tried it on many files; it worked with some and failed with others. With 3-4 files it seems to always happen with the same ones, but when I tested with many files I couldn't pin it down to specific files.
Observed behavior
Threw an error.
Expected behavior
Not to.
Desktop (please complete the following information):
Additional context
When I ran the same program with OpenXml 2.2 it threw UriMalformed (or something like that) and crashed at
using var spreadSheetDocument = SpreadsheetDocument.Open(excelStream, false);
The hyperlinks are in the format of som_e_thing_@LIK_th is
I can't attach the files I tested with, sorry.
Might be related to "Add better support for malformed URIs" (#1322).