Mixed comparison tries to compare binaries and strings in the same group; however, it uses a different ordering for string/string comparisons (our utf8_compare) than for bin/bin and bin/string comparisons (the latter two both use memcmp order). This breaks the normal mathematical laws of total orderings, such as transitivity (a < b && b < c implies a < c), that a lot of code relies on for correctness. We solved this for Sets with a special case that makes all strings compare less than all binaries, but that only applies to the comparison internal to Sets. Among other issues, that means the order of a Set<Mixed> now differs from the order of a sorted List<Mixed>, which broke a test in realm-swift. We should solve this problem for all users of Mixed once and for all.
I think there are 3 reasonable options, but only the first avoids a file format bump, assuming we want Set to use the new order:
1. Make Mixed comparison always consider strings less than binaries
2. Use memcmp order for strings (which matches codepoint order because UTF8 is great) rather than our utf8_compare (which is broken)
3. Do both (change the order for strings, but still keep them separate from binaries)
As proof that the existing order is broken, and not just odd, here is a test case showing that it violates transitivity:
TEST(Mixed_string_binary_compare)
{
    const auto a = Mixed("a");
    const auto b = Mixed("B");
    const auto c = Mixed(BinaryData("C", 1));
    // For a total order, at least one of these must fail!
    CHECK_LESS(a.compare(b), 0);
    CHECK_LESS(b.compare(c), 0);
    CHECK_LESS(c.compare(a), 0);
}
And here is the result of passing all 6 permutations of those values to std::sort():
Because no values are equal, all 6 outputs should be the same. (Note that "C" is binary data even though we print it like a string.)
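To make the cycle and the first two options easier to experiment with outside of core, here is a minimal standalone sketch. Everything in it is hypothetical: `Value` is a stand-in for Mixed, and `caseless_compare` is NOT our utf8_compare, just an ASCII case-insensitive order that shares the one property the test case relies on ("a" sorts before "B"). The tie-break in the option 2 comparison is likewise an assumption, not a decided design.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for Mixed: a byte sequence tagged as string or binary.
struct Value {
    bool is_binary;
    std::string bytes;
};

// memcmp-style order: std::string::compare compares chars as unsigned bytes,
// and a shorter prefix sorts first.
inline int bytewise_compare(const std::string& x, const std::string& y)
{
    const int c = x.compare(y);
    return (c > 0) - (c < 0);
}

// ASSUMPTION: not the real utf8_compare. ASCII case-insensitive order with a
// bytewise tie-break, chosen only because it puts "a" before "B".
inline int caseless_compare(const std::string& x, const std::string& y)
{
    auto lower = [](std::string s) {
        std::transform(s.begin(), s.end(), s.begin(),
                       [](unsigned char ch) { return static_cast<char>(std::tolower(ch)); });
        return s;
    };
    const int c = bytewise_compare(lower(x), lower(y));
    return c != 0 ? c : bytewise_compare(x.bytes_placeholder_unused_never, y.bytes_placeholder_unused_never);
}
```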