Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds another rule for null deref #16524

Merged
merged 4 commits into from
May 22, 2024
Merged

Conversation

catenacyber
Copy link
Contributor

Inspired by OISF/suricata#11098

Copy link
Contributor

QHelp previews:

cpp/ql/src/experimental/Likely Bugs/DerefNullResult.qhelp

Null dereference from a function result

This rule finds a dereference of a function parameter, whose value comes from another function call that may return NULL, without checks in the meantime.

Recommendation

A check should be added between the return of the function which may return NULL, and its use by the function dereferencing ths pointer.

Example

char * create (int arg) {
    if (arg > 42) {
        // this function may return NULL
        return NULL;
    }
    char * r = malloc(arg);
    snprintf(r, arg -1, "Hello");
    return r;
}

void process(char *str) {
    // str is dereferenced
    if (str[0] == 'H') {
        printf("Hello H\n");
    }
}

void test(int arg) {
    // first function returns a pointer that may be NULL
    char *str = create(arg);
    // str is not checked for nullness before being passed to process function
    process(str);
}

References

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @catenacyber,

Thanks for your contribution. It looks like a really sensible query!

I've got one comment, but otherwise I think this looks good to me.

CI is currently failing because the query isn't formatted according to our style guide. See the Formatting section in https://github.com/github/codeql/blob/main/CONTRIBUTING.md for how to automatically format the file to fit our style guide.

*/

import cpp
import semmle.code.cpp.dataflow.DataFlow
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

semmle.code.cpp.dataflow.DataFlow has been deprecated in favor of semmle.code.cpp.dataflow.new.DataFlow. Would it be possible to use that library instead? It should just be a matter of changing the import since the new library should be strictly better than the old one 🤞

Suggested change
import semmle.code.cpp.dataflow.DataFlow
import semmle.code.cpp.dataflow.new.DataFlow

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed that

@catenacyber
Copy link
Contributor Author

Thanks for your contribution. It looks like a really sensible query!

Thanks @MathiasVP

I've got one comment, but otherwise I think this looks good to me.

CI is currently failing because the query isn't formatted according to our style guide. See the Formatting section in https://github.com/github/codeql/blob/main/CONTRIBUTING.md for how to automatically format the file to fit our style guide.

Just changed that in the latest commit.
Did not manage to do the formatting with vscode, but ran codeql query format --in-place manually

@catenacyber
Copy link
Contributor Author

catenacyber commented May 20, 2024

Some comments on the results on the top 100 projects.

Most seem to come from ignoring that malloc or such can return NULL.

I found at least one false positive, where besides the buffer being returned, there is a size, and the size is tested instead of the pointer for the NULL/0 case...

And there also other true positives ;-)

// there is a function call which will deref parameter pd
fc.getTarget() = pd.getFunction() and
// the parameter pd comes from a variable v
DataFlow::localFlow(DataFlow::exprNode(v.getAnAccess()),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the right way to express it ?
I would also like that process(create(arg)); from upper example is found even if there is no intermediary variable

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could do local dataflow from the expression node corresponding to a call to nuller and to the argument of fc. This would remove the need for v completely.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I do
DataFlow::localFlow(DataFlow::exprNode(nuller.getACallToThisFunction()), DataFlow::exprNode(fc.getArgument(pd.getIndex())))

I get a lot of false positives from the right use cases where the nuller return value is checked before calling fc

not exists(VariableAccess vc |
vc.getTarget() = v and
(vc.getParent() instanceof Operation or vc.getParent() instanceof IfStmt)
)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we test that the variable is assigned only once ?
So as not to have FP with

    char *str = create_unsafe(arg);
    process_safe(str);
    str = create_safe(arg);
    process_unsafe(str);

Copy link
Contributor

@MathiasVP MathiasVP May 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could do wrap v.getAnAssignedValue() in a unique such as:

unique(| | v.getAnAssignedValue()) = nuller.getACallToThisFunction()

Note, however, that it may be better to simply remove the need for v completely as I suggested here 🙂

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just did that as the other solution does not work out so good ;-)

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now!

@MathiasVP MathiasVP merged commit 00a940f into github:main May 22, 2024
14 checks passed
@catenacyber
Copy link
Contributor Author

Thanks Mathias, what is next ? (for https://securitylab.github.com/bounties/)

For instance, I see multiple projects use https://github.com/nothings/stb and https://github.com/nothings/stb/blob/master/stb_image_write.h#L767 malloc return is unchecked when other malloc returns are checked

I think the google/re2 finding is a FP due to functionDereferences stating that https://github.com/google/re2/blob/b7e96b34c0945fccb8b5282404f82c7ab0843717/re2/prefilter_tree.cc#L39 dereferences its only argument...

@MathiasVP
Copy link
Contributor

Ah! I was not aware that you intended to have this be part of the bounty program 🙂

I guess you want to continue to point 4 in https://securitylab.github.com/bounties/:

Create an issue using the all for one template and a detailed report on the class of vulnerabilities your query is intended to find.

@catenacyber
Copy link
Contributor Author

Ah! I was not aware that you intended to have this be part of the bounty program 🙂

Should I not ?

I guess you want to continue to point 4 in https://securitylab.github.com/bounties/:

Create an issue using the all for one template and a detailed report on the class of vulnerabilities your query is intended to find.

Ok, will do

@catenacyber
Copy link
Contributor Author

github/securitylab#826

@MathiasVP
Copy link
Contributor

Should I not ?

That's fine. The SecurityLab team will take it from here 🤞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants