C++: Constant type-bounds in the new range analysis #13783
@@ -936,7 +936,7 @@ void two_bounds_from_one_test(short ss, unsigned short us) {
     range(ss); // -32768 .. 32767
   }

-  if (ss + 1 < sizeof(int)) { // $ overflow=+
+  if (ss + 1 < sizeof(int)) { // $ overflow=-
👍 verified experimentally
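One way to check the guard by hand is to evaluate it for a few values of `ss`. The sketch below is not part of the PR, and the names `guard_holds` and `check_guard` are hypothetical. It relies only on the usual arithmetic conversions: `ss + 1` has type `int`, `sizeof(int)` has type `std::size_t`, so the left operand is converted to an unsigned type before the comparison (assuming `std::size_t` is at least as wide as `int`).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the PR): evaluates the guard from the test.
// `ss + 1` is an `int`; comparing it with a `std::size_t` converts it to
// unsigned, so a negative sum wraps around to a very large value.
bool guard_holds(short ss) {
    return ss + 1 < sizeof(int);
}

// Exercises the guard on values around the wrap-around point.
void check_guard() {
    assert(guard_holds(-1));   // 0 < sizeof(int)
    assert(!guard_holds(-2));  // -1 converts to a huge unsigned value
    assert(!guard_holds(static_cast<short>(sizeof(int))));  // sizeof(int) + 1 is too large
}
```

Compilers typically warn about the signed/unsigned comparison here, which is the same conversion the test annotation is concerned with.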
I've verified the results and they LGTM.
void f(unsigned int ui) {
  unsigned long long ull = ui;
  range(ull); // we infer that `ull <= UINT_MAX` here. Is that what we want?
}
Edit: We now let each query do that using the |
Why wouldn't we want to deduce that |
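For reference, the bound in question is sound because an `unsigned int` to `unsigned long long` conversion preserves the value. A minimal sketch of that fact follows; `widen` and `kTypeBound` are hypothetical names used only for illustration.

```cpp
#include <limits>

// Hypothetical helper mirroring the snippet in the comment above:
// widening an unsigned int never changes its value.
unsigned long long widen(unsigned int ui) {
    unsigned long long ull = ui;  // value-preserving conversion
    return ull;
}

// The type-based upper bound under discussion: the largest unsigned int.
constexpr unsigned long long kTypeBound =
    std::numeric_limits<unsigned int>::max();

// The bound always fits in the wider type.
static_assert(kTypeBound <= std::numeric_limits<unsigned long long>::max(),
              "unsigned long long is at least as wide as unsigned int");
```

The bound holds for every input, but it carries no information beyond the source type of the conversion.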
Some small comments, just to further my understanding of the code.
cpp/ql/lib/semmle/code/cpp/rangeanalysis/new/internal/semantic/SemanticExprSpecific.qll
cpp/ql/test/library-tests/ir/range-analysis/SimpleRangeAnalysis_tests.cpp
Because it's not a very precise bound, and many users have been confused by false positives caused by such very-large-but-sound bounds obtained from type information only. For example, this change was made because we wanted to be able to distinguish between precise bounds obtained from guards and less precise bounds obtained from type information, for a high precision version of the

And then there's this beauty from an external contributor who had a similar problem: https://github.com/github/codeql/blob/main/cpp/ql/src/experimental/Security/CWE/CWE-561/FindIncorrectlyUsedSwitch.ql#L21
Not as part of this PR, but as each bound comes with a reason, we might want to explore whether we can add a new reason for the type bounds that are introduced here.
The DCA alert changes do not make any kind of sense to me. Any clue what was going on there?
Thanks. I would be fine with having this merged, assuming DCA against |
Good point. I actually didn't test this against |
Uh oh. Since we now infer many many more constant bounds the
I think that's another good reason to do what Jeroen said here:
But since this is causing FPs on |
I've pushed two temporary commits to this PR: 54b903c and 59b7d3e. Together, these filter out type-based bounds from the two MCTV queries that DCA showed had new FPs as a result of these changes. If DCA comes back happy I'll move those commits to another PR so that we can merge that one before continuing with this PR.
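The filtering idea can be modelled outside QL. The following is a rough C++ sketch, not the code from those commits: `Reason`, `Bound`, and `dropTypeBounds` are hypothetical names, and the real queries work on CodeQL predicates rather than vectors.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical model: where a bound came from.
enum class Reason { Guard, Type };

// Hypothetical model of one inferred bound on an expression.
struct Bound {
    double value;   // the constant bound
    bool upper;     // true for an upper bound, false for a lower bound
    Reason reason;  // why the analysis believes the bound holds
};

// Keeps only the bounds that were not derived from type information alone,
// which is what a high-precision query would want to look at.
std::vector<Bound> dropTypeBounds(const std::vector<Bound>& bounds) {
    std::vector<Bound> kept;
    std::copy_if(bounds.begin(), bounds.end(), std::back_inserter(kept),
                 [](const Bound& b) { return b.reason != Reason::Type; });
    return kept;
}
```

Filtering on the consumer side leaves the analysis itself sound and complete, and lets each query decide how much imprecision it tolerates.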
44a88b8 to 36ffa12
Sorry I didn't see the notification for this until now. I'll have a quick look at the DCA alert differences on the latest run, particularly for |
Please don't do so yet. I'm still fixing things. I'll move this to a draft to make this explicit.
My long series of TEMP commits seems to have removed all the type-related FPs, and we're now left with four new results. I thought these were FPs similar to the ones I fixed in the earlier commits, but they turn out to be genuine (I mean ... they're still FPs wrt. the query, but the ranges reported by the analysis are all correct). For example, one of the four new alerts is on this line where we report that the

However! After a long and painful debugging session I found this line which is the source of this reported

TLDR: It seems like things work now 👍. I'll clean up this PR and take it out of draft later today.
I've moved all the TEMP commits to a fresh PR: #13880. Once that PR is merged I can rebase this PR and pull this out of draft 🎉
1822a67 to 2d832db
DCA looks good. We have the same set of results as the very first DCA run (all of which we verified). There was a slowdown on Wireshark which I couldn't reproduce locally, and a subsequent DCA run didn't show any slowdown either. So I think this PR is good to go!
One question, otherwise LGTM.
-  predicate hasConstantBound(SemExpr e, float bound, boolean upper) { none() }
+  predicate hasConstantBound(SemExpr e, float bound, boolean upper, SemReason reason) {
+    semHasConstantBoundConstantSpecific(e, bound, upper) and
+    reason instanceof SemTypeReason
I have trouble following this. Why is this restricted to `SemTypeReason` only?
Because `semHasConstantBoundConstantSpecific` only infers bounds based on type information. It's probably possible to push this condition into `semHasConstantBoundConstantSpecific`, but that then requires that `semHasConstantBoundConstantSpecific` knows about the `SemReason` type, which it currently doesn't since that predicate is defined outside the `CppLangImplConstant` module (which instantiates those types).
Does that make sense?
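The layering described here can be illustrated with a C++ analogy. This is only a sketch of the pattern, not the QL implementation: `Reason`, `TaggedBound`, `typeBoundSpecific`, and `constantBound` are all hypothetical names.

```cpp
#include <limits>

// Hypothetical stand-in for SemReason; only the type-based case is used here.
enum class Reason { Condition, Type };

// Hypothetical result of the wrapper: a bound together with its reason.
struct TaggedBound {
    double value;
    bool upper;
    Reason reason;
};

// Reason-agnostic layer: returns the bound implied by the type T alone.
// It plays the role of the "specific" predicate, which knows nothing
// about reasons.
template <typename T>
double typeBoundSpecific(bool upper) {
    return upper ? static_cast<double>(std::numeric_limits<T>::max())
                 : static_cast<double>(std::numeric_limits<T>::lowest());
}

// Wrapper layer: every bound produced by the layer above is type-based,
// so this is the one place that attaches Reason::Type.
template <typename T>
TaggedBound constantBound(bool upper) {
    return TaggedBound{typeBoundSpecific<T>(upper), upper, Reason::Type};
}
```

Keeping the tag in the wrapper means the lower layer does not need to depend on the reason type at all, which matches the constraint described in the comment.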
So zero bounds are not constant?
`semHasConstantBoundConstantSpecific` just forwards to `hasConstantBoundConstantSpecific` (which is the predicate added here). The only thing that adds is the type bounds inferred from type conversions.
We briefly discussed this face-to-face. This is the correct thing to do here because as Mathias writes:
> `semHasConstantBoundConstantSpecific` only infers bounds based on type information

We could extend this later and push down the `reason` argument, but that is not of immediate concern here.
This PR adds type-based bounds in the new range analysis library. This allows us to deduce that fewer things overflow, and thus allows us to exclude fewer bounds.
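As an illustration of why type bounds help with overflow reasoning, consider incrementing a value that was converted from a narrow type. The sketch below is a simplified model and not the library's algorithm; `Range`, `typeRange`, and `incrementMayOverflowInt` are hypothetical names, and `typeRange` assumes the type's values fit in `long long`.

```cpp
#include <limits>

// Hypothetical model of an inclusive constant range.
struct Range {
    long long lo;
    long long hi;
};

// Range implied by an integer type T alone (assumes T fits in long long).
template <typename T>
constexpr Range typeRange() {
    return Range{static_cast<long long>(std::numeric_limits<T>::min()),
                 static_cast<long long>(std::numeric_limits<T>::max())};
}

// True if adding 1 to some value in `r` could exceed the largest int.
constexpr bool incrementMayOverflowInt(Range r) {
    return r.hi >= static_cast<long long>(std::numeric_limits<int>::max());
}
```

With only the range of `int` available, the increment has to be treated as possibly overflowing. Knowing that the operand came from an `unsigned char` rules that out on platforms where `int` is wider than `unsigned char`.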
I'll test this by locally rebasing this branch onto #12505 and running DCA on that.