-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce mcdc::TVIdxBuilder (LLVM side, NFC) #80676
Conversation
This accept current version of profdata. The output might be different. See also https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
…de)" This reverts commit d168e0c.
Deprecate `TestVectors`, since no one uses it. This affects the output order of ExecVectors. The current impl emits sorted by binary value of ExecVector. This impl emits along the traversal of `buildTestVector()`.
This accepts current version of profdata. The output might be different. See also https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798
I reworked to align the current implementation. |
Nodes[0].Width = 1; | ||
Q.push_back(0); | ||
|
||
unsigned Ord = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a bit more commentary on the overall process of the algorithm (and MCDCTVIdxBuilder)? What are the expected inputs and the expected output, generally? I think it would help with the readability a bit more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added comments and the unittest. Hope it helps!
// `Index` encodes the bitmask of true values and is initially 0. | ||
MCDCRecord::TestVector TV(NumConditions, MCDCRecord::MCDC_DontCare); | ||
buildTestVector(TV, 1, 0); | ||
buildTestVector(TV, 0, 0, 0); | ||
assert(TVIdxs.size() == NumTestVectors && "TVIdxs wasn't fulfilled"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good assert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look. I will update this evening.
public: | ||
struct MCDCNode { | ||
int InCount = 0; /// Reference count; temporary use | ||
int Width; /// Number of paths (>= 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not name it NumPaths
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was W
in my prototype design. I have to update my imagination if this would be renamed.
Or, does the comment mislead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://en.wikipedia.org/wiki/Width_(disambiguation) Width has many meanings for graphs. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which is better for me to rename to easy N
or to define Width
here.
I haven't learnt the graph theory though, looks like I have introduced similar concept, as far as I read introductions of the theory.
Let me think more.
* Split out `Indices[ID][Cond]` * Let `Nodes` debug-only. * Introduce `Offset` * Introduce `HardMaxTVs`
* Sink `TVIdxBuilder` into `mcdc::`. * The ctor accepts `SmallVector<ConditionIDs>` indexed by `ID`. * `class NextIDsBuilder` provides `NextIDs` as`SmallVector<ConditionIDs>`, for `TVIdxBuilder` to use it before `MCDCRecordProcessor()`. It was `BranchParamsMap` or `Map` as `DenseMap<Branch>`. * `NodeIDs` and `Fetcher` function are deprecated.
Indices[ID][I] = NextNode.Width; | ||
auto NextWidth = int64_t(NextNode.Width) + Node.Width; | ||
if (NextWidth > HardMaxTVs) { | ||
NumTestVectors = HardMaxTVs; // Overflow |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add debug output when there is an overflow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess overflow is impossible in llvm-cov.
OTOH, Clang side will catch such a overflow and tell it with HardMaxTVs
for warnings/errors. See MCDCCoverageBuilder
in CoverageMappingGen.cpp
, and mcdc-error-conditions.cpp
.
I wonder it would be valuable to add debug message here.
} | ||
} | ||
|
||
std::sort(Decisions.begin(), Decisions.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use llvm::sort?
I think it takes ranges so you can just do llvm::sort(Decisions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've forgot it.
} | ||
} | ||
|
||
// Sort key ordered by <-Width, Ord> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excuse me. Why end nodes of the decision need to be sorted in descending order of width?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it may be available for integrating branch and mcdc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So motivation of the sort is to prepare for truncating all conditions with width=1
, is it?
If so, why not just remove them then with an O(n) method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you think they could be removed? It's the prerequisite to assign identical IDs between clang and llvm-cov.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, if we did not sort them, we can not reduce size of bitmaps effectively since there might be other conditions with width>1
getting large index.
I'm trying to remove the sort because I find it breaks some nature if the topological relationship of conditions is abnormal (e.g conditions from pattern matching in rust) and generates troubled index maps. I would investigate it more. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Could you introduce me the discussion, or paste the problematic cond tree anywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
flowchart TD
C0 -- T --> C1
C1 -- F --> F1
C1 -- T --> C3
C0 -- F --> C2
C2 -- T --> C3
C2 -- F --> F2
C3 -- F --> C4
C3 -- T --> T3
C4 -- T --> T4
C4 -- F --> F4
Normal boolean expressions can not generate such decision tree. In normal boolean decisions, either true next of a condition is a next (true or false) of its false next, or false next of a condition is a next of its true next.
I tried to disable the sort and fixed some cases. So I guessed something was broken by the sort. But that's wrong. The real problem is the way to update cond bitmap, which is specified for previous cond bitmap but is inappropriate for current. This implementation still works for pattern matching, thanks to your excellent efforts!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At a glance, F1
and F2
have w=1
, and others have w=2
. Correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The index map is right. The key difference is at most one of C0
and C2
can be true, which not likes (C0 && C1) || C2
. In previous I updated false bit of C2
when C0
is true since if C0
is true, C2
must be false. While at present we can not do this.
Sure, I’d collate it. Please wait some time.
|
This is a preparation of incoming Clang changes (#82448) and just checks
TVIdx
is calculated correctly. NFC.TVIdxBuilder
calculates deterministic Indices for each Condition Node. It is used forclang
to emitTestVector
indices (aka ID) and forllvm-cov
to reconstructTestVectors
.This includes the unittest
CoverageMappingTest.TVIdxBuilder
.See also
https://discourse.llvm.org/t/rfc-coverage-new-algorithm-and-file-format-for-mc-dc/76798