Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to 0.104.1 #368

Open
wants to merge 20 commits into
base: develop
Choose a base branch
from
Open

Update to 0.104.1 #368

wants to merge 20 commits into from

Conversation

jahav
Copy link
Member

@jahav jahav commented Oct 11, 2024

I am updating ClosedXML.Report because 0.104.1 was released.

A lot of unit tests is failing, mostly because I modified how defined names are working and I am slowly moving away from automagic that shifts everythign based on ranges.

I will fix tests one by one and familiarize myself with the codebase.

jahav added 4 commits October 11, 2024 13:45
This is the first commit, so I can try to fix the rest. It compiles and test fail.
The test failed, because some subtotals didn't have correct address.

The root cause was the ShiftFormula and ExpandFormula methods. They
adjusted only references in a range form ((e.g. `SUBTOTAL(9,I41:I61)`),
not single cell form (e.g. `SUBTOTAL(9, I41)`).

The condition was added in Initial commit 7 years ago, so it likely
isn't needed anymore (regex seems to find references). By removing the
redundant condition, test passes.
The test was failing for two reasons:

The template contained a <definedName name="Table1Range">
'Sheet 1'!#REF!</definedName> and when ClosedXML.Report tried to shift
the range, a range is shiften, it failed because setter
XLDefinedName.ReferesTo tried to extract references from that formula
(=no valid references). That was propagated to
FormulaReferences.ForFormula where parser reported an empty formula
error.

That is a ClosedXML error, XLDefinedName.RefersTo setter shouldn't try
to get references for empty formula.

The second problem was in the Gauges file. It contained formulas that
had area form (SUBTOTAL(9,I41:I41)) depsite being single cell
(SUBTOTAL(9,I41:I41)). Formulas thus weren't equal and test failed.
Caused by ClosedXML bug. When a sheet is copied (in this case to temp
sheet), it doesn't add links to calculation chain. But then, when it
tries to insert rows, formulas are shifted.

When XLRangeBase.InsertRowsAboveInternal is inserting rows, it also
shifts formulas of cells. It calls FormulaSlice.Set with shiufted
formula. The FormulaSlice then removes calc chain link for original
formula and adds for new formula. The problem is that it throws when
old formula doesn't have a link (and it doesn't because of copy).

ClosedXML should add links to calculation chain when sheet is copied,
but I am not sure. Should we even throws on missing link anyway?
ClosedXML will get a lot of files without correct calc chain.
@rekosko
Copy link

rekosko commented Nov 26, 2024

Any updates on this? :) I have some export performance issues when one of the cells has a specific formula that concatenate text and I'm hoping that with newest version this issues could be resolved.

@jahav
Copy link
Member Author

jahav commented Nov 26, 2024

0.104.1 is unfeasible. There are few bugs that need to be fixed (desc in commits). Still hoping to have it fixed by 0.105.

The root cause was that GroupTags (inherits SortTag) sorted the option
row at the beginning instead of the end. The option tag contained empty
strings.

It was fixed by changing two places:
* When tag is applied to template cells and the result is empty string,
  use blank instead.
* SortTag uses ignoreBlanks: true, which treats blanks as blanks. Blanks
  in sorting are always sorted at the end (regardless of asc/desc).

It's highly likely SortTags (`context.Range.Sort()`) shouldn't sort range
*including* the last option tag row. I am not familiar enough with the
codebase to do that change yet.

The Subranges_Multiple.xlsx was fixed to change of single cell areas for
subtotal (e.g.`SUBTOTAL(9,D1:D1)`) to single cell (e.g.`SUBTOTAL(9,D1)`).
The bug in SubrangesTests.MultipleSubRanges was caused by setting a value
of a cell to string.Empty to option tag row that was later sorted as the
first row instead of last row.

This reduces that chance by parsing tags only from string values. Other
types can never contains <<tag>> text anyway, so it's fine.
The original output differed in the style charset attribute of cells G13
C15 G15. The original had charset Default, while new one has charset
Russian.

The charset attribute has included in the ClosedXML style recently, so I
am just rewriting it. It's not even visible in the GUI.
…with_delete.xlsx)

Both tests had same two problems:
* A lot fo cells was maked as different, because in the gauge, they had
  default charset while the output had russina charset.
* The cell G24 was marked as unlocked, but in gauge it was locked

1. Style font char set difference
Charset was added to the Font in 0.104, so the equality comparer didn't
check it before, but does now. Therefore it is safe to ignore. Charset is
auxiliry property for a selection of a fallback font anyway.

2. G24 protection lock vs unlock

The problem was that cell G24 was locked in 0.102.2 and wasn't locked in
0.104.2. The root cause was also charset difference in a style of the
cell.

The ProtectedTag class first unlocks all used cells through

`ws.Cells().ForEach(c => { c.Style.Protection.Locked = false; });`

The ws.Cells() in returns only used cells. That is cells that don't have
value and have same style as a worksheet (relevant properties for this
case). That was true for the cell G24 in 0.102.2, because it ignored
charset property of a style. The cell G24 was thus evaluated as unused
and lock wasn't changed (although it was in fact locked, but so was sheet
- no difference in sheet vs cell style).

The 0.104.2 does however take into account a charset of a style. It found
that cell contains Default charset while sheet style contains Russian
charset. Therefore style of a cell was different and thus cell wasn't
considered empty.

ProtectedTag unlocks the non-empty cells (including G24) and that was the
root reason of discrepancy.

The gauge workbook can therefore be safely replaced by generated one.
@jahav
Copy link
Member Author

jahav commented Dec 11, 2024

Turns out value comparison check that compared two sheet values hasn't worked... for a while. It tried to get a field value through reflection and field has been gone for years.

I had pretty good grasp on what needed to be done within scope of 0.105, but now I got another 20 failing tests.

if ([...] actualCell.GetInnerText() != expectedCell.GetInnerText())
{
    messages.Add($"Cell values are not equal starting from {address}");
    cellsAreEqual = false;
}

internal static string GetInnerText(this IXLCell cell)
{
    _xlCellInnerText ??= cell
        .GetType()
        .GetProperty("InnerText", BindingFlags.Instance | BindingFlags.Public);

    return (string)_xlCellInnerText?.GetValue(cell, null);
}

The original one didn't fail on different values. Two cells could have
different values and it would declare them equal.

The property to compare value has been gone for a long time and thus
reflection always returned null. null is equal to null.
@XuJin186
Copy link

Any updates on this? :)

@XuJin186
Copy link

XuJin186 commented Jan 6, 2025

@jahav Do you have any plans to continue maintenance? If not, I can participate.

jahav added 6 commits January 6, 2025 02:05
The charset attribute has been added tot he font and it causes problems
with not-equality. Removing it in both template and gauge. Charset is basically
only a hint which font should be selected when original one is not found.
The test file sorts on a colkumn with multiple same values. The sorting algorithm
is now slightly different so the rows with same sort value are in different place.

Other than that, result is correct.
Pre-0.103, the range sorting wasn't stable (it was a custom quick-sort). Post 0.103, the range sorting is stable (as it should be, just like Excel plus it makes sense). The sorting is used when values are separated into groups. Originally, the order of items wasn't kept, but now the order of items is kept.

The item groups in the test files are generallt orders and they have PK order id. As such, they are loaded from db in sequential order. Therefore the item groups should keep orderNo (PK) in ascending order,because fo stabel sort. The orderNo is *not* actually sorted, but because it was loaded from db in ascending order (thanks to PK), it looks sorted.

Also remove charset from font definition in styles.xml.

* GroupTagTests_DisableOutline.xlsx - sort orders in each group, remove charset from font
* GroupTagTests_FormulasInGroupRow.xlsx - sort orders in each group, remove charset from font, fix inconsistent border
* GroupTagTests_MergeLabels.xlsx - sort orders in each group, remove charset from font
* GroupTagTests_MergeLabels2.xlsx - sort orders in each group, remove charset from font
* GroupTagTests_PlaceToColumn.xlsx - sort orders in each group, remove charset from font
* GroupTagTests_SummaryAbove.xlsx - sort orders in each group, remove charset from font
* GroupTagTests_SummaryAbove.xlsx - sort orders in each group, remove charset from font, border didn't match in merged regions => check that matches (ignore border) and then replace original file.
* GroupTagTests_NestedGroups.xlsx - sort orders in each group, remove charset from font
* GroupTagTests_MultiRanges.xlsx - remove charset from font
* GroupTagTests_WithHeader.xlsx - sort orders in each group, remove charset from font
* GroupTagTests_Collapse.xlsx - sort orders in each group, remove charset from font
The 9_plaindata.xlsx template contained two cells with a lot of decimal numbers
(H47 873.27000000000021 and I47 19627.879999999997), but gauges contained
rounded numbers (873.27 and 19627.88).

Therefore, the actual calculated value (from template) that had the number from
template (e.g. 873.27000000000021) while gauge contained 873.27. As a result, there
was a difference and SubtotalTests.PageBreaks plus few others failed.

This change updates the 9_plaindata.xlsx so it contains rounded numbers.

Same thing happened with the 91_plaindata.xlsx template, except test was
SubtotalTests.WithHeaders.
The test failed, because the original solution didn't clear out the <<sum>>
in the option row. The result thus had <<sum>> in the output instead of
empty cell.

I think the whole extra branch should be deleted, but that would require
deep modification of RangeTemplate and I don't know enought to do that.

Adjust solution for #251
@jahav
Copy link
Member Author

jahav commented Jan 8, 2025

@XuJin186 Generally speaking, I intend to keep up with breaking changes in the ClosedXML. As long as test pass with latest version of ClosedXML, that's enough for me.

Example of what broke: unstable -> stable sort of ranges broke a lot of test files, Blank instead of text broke other things.
Things I don't like: there is reflection in various places, use proper parser instead of regex) or improve clear performance bottlenecks. But that gets a low priority compared to the work on ClosedXML.

As for this PR, it's nearly done (3 fails due to bug in ClosedXML). It needs a one fix to ClosedXML that is somewhere in a PR and there will be (rc) release.

@XuJin186
Copy link

XuJin186 commented Jan 9, 2025

@jahav Thank you for your contribution to this and look forward to your good news.

jahav added 3 commits January 10, 2025 19:57
The pivot structure in 0.103 was completely overhauled and it is now
necessary to set the tabular layout.

Honestly, the whole "create pivot table" feature should be scrapped.

It's a wrong way to go about a generating a pivot table. The pivot table
should be created in Excel and should reference the range. That way, user
would have full control over all ascpets of the pivot table. The only thing
necessary would be to adjust the range and refresh the cache.

Instead, there is a special tag that contains a very limited subset of pivot
table options and the report generator tries to create new pivot table...
That's pointless.
* GroupTagTests_Simple.xlsx - sort orders in each group, remove charset from font
* GroupTagTests_Simple_WithOutsideLink.xlsx - sort orders in each group, remove charset from font

These fixes also require 0.105-rc version of ClosedXML.
@jahav jahav marked this pull request as ready for review January 25, 2025 15:45
@XuJin186
Copy link

XuJin186 commented Feb 7, 2025

@jahav I think all the tests have passed. You can release an RC version first.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants