Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Go: Make flow configurations use new data flow API #13820

Merged
merged 64 commits into from
Aug 11, 2023

Conversation

owen-mc
Copy link
Contributor

@owen-mc owen-mc commented Jul 26, 2023

I highly recommend reviewing commit by commit (and spreading it out over several sessions). All configurations have tests (normally of the related query). There are no changes in results, though sometimes there are changes in exact PathNodes and edges between them.

@owen-mc owen-mc requested a review from a team as a code owner July 26, 2023 09:30
@github-actions github-actions bot added the Go label Jul 26, 2023
@owen-mc owen-mc force-pushed the go/refactor-data-flow-configurations branch from 85baa81 to 1d2c268 Compare July 26, 2023 09:34
@owen-mc owen-mc added the no-change-note-required This PR does not need a change note label Jul 26, 2023
@owen-mc owen-mc force-pushed the go/refactor-data-flow-configurations branch from 1d2c268 to 028eb9b Compare July 26, 2023 09:51
@owen-mc owen-mc marked this pull request as draft July 26, 2023 10:25
@owen-mc owen-mc force-pushed the go/refactor-data-flow-configurations branch from 028eb9b to 9919221 Compare July 26, 2023 21:18
@owen-mc owen-mc force-pushed the go/refactor-data-flow-configurations branch from 9919221 to 355f9c6 Compare July 27, 2023 06:13
@owen-mc owen-mc marked this pull request as ready for review July 27, 2023 06:14
@owen-mc
Copy link
Contributor Author

owen-mc commented Jul 27, 2023

I will rebase on main when #13822 is merged, which will fix the sync files check.

@owen-mc owen-mc force-pushed the go/refactor-data-flow-configurations branch 2 times, most recently from d8ed7fe to 9376ead Compare August 1, 2023 07:04
Copy link
Member

@mbg mbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work updating all of this! ✨ The refactoring generally looks reasonable. There are a few specific comments I had while looking over this and two more broad ones:

  • For queries where the expected test output changed and you haven't done so yet, it would be good to note why it changed in the commit description or in the PR description.
  • Most/all of the old data/taint flow configurations had docs comments explaining their purpose. However, the new ConfigSig implementations and data/taint flow configurations mostly do not have matching docs comments. In general, it would be nice to have more documentation/comments.

There also seem to a good few ConfigSigs which are basically the same as a result of the [..]Customizations approach along the lines of:

private module Config implements DataFlow::ConfigSig {
    predicate isSource(DataFlow::Node source) { source instanceof Source }

    predicate isSink(DataFlow::Node sink) { sink instanceof Sink }

    predicate isBarrier(DataFlow::Node node) { node instanceof Barrier }
}

Would it be possible (in the future, not as part of this PR) to refactor this so that we have a generic implementation that we can just parameterise over the module that's providing the Source, Sink, and Barrier?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For queries where the expected test output has changed, it would be good to include some discussion in the PR description why this has changed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It generally isn't obvious why. I know one cause is when there were more than one config in scope in the old version. I'm not sure how much value there is in tracking it down. I'm going to bet that in the corresponding PRs for java and C# they have similar changes and they haven't looked into what caused them. (Of course if there was a change in the results it would be different. It's just because the only change is in path nodes and edges.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we get the edges here but didn't before?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How come we get the edges here but didn't before?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the differences?

@@ -148,6 +150,115 @@ class ConversionWithoutBoundsCheckConfig extends TaintTracking::Configuration {
}
}

/** Flow state for ConversionWithoutBoundsCheckConfig. */
newtype MyFlowState =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this have a less generic name, e.g. IntegerConversionState?

Comment on lines +6 to +9
source =
any(Function f | f.getName() = ["getUntrustedString", "getUntrustedStruct"])
.getACall()
.getResult()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I note that the test previously ranged over all of UntrustedFlowSource but now is an inlined UntrustedSource. Do we miss out anything important by not having this range over all of UntrustedFlowSource anymore?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so - we're just trying to match the two functions used as sources in the test files. I can only imagine was written like this to be more like a real source definition.

Comment on lines +6 to +8
exists(Function fn | fn.hasQualifiedName(_, ["getUntrustedString", "getUntrustedBytes"]) |
source = fn.getACall().getResult()
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This previously ranged over all of UntrustedFlowSource, but is now more limited. Do we miss out on anything as a result of this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I don't think it matters for tests.

@@ -9,86 +9,88 @@ import (
//go:generate depstubber -vendor k8s.io/apimachinery/pkg/runtime ProtobufMarshaller,ProtobufReverseMarshaller

func source() interface{} {
return make([]byte, 1, 1)
return make([]byte, 1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My IDE told me it was better to do it this way. Presumably it's a built-in Go linter of some kind.

)
}

int fieldFlowBranchLimit() { result = 1000 }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment noting why you added this here for this particular configuration?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It isn't needed. I've removed it. I must have added it by mistake, probably copying from somewhere else.

@owen-mc
Copy link
Contributor Author

owen-mc commented Aug 7, 2023

I believe I've addressed all comments except explaining the changes in test expectations.

@owen-mc owen-mc force-pushed the go/refactor-data-flow-configurations branch from 825b716 to 9612740 Compare August 9, 2023 21:05
@owen-mc
Copy link
Contributor Author

owen-mc commented Aug 9, 2023

I've added explanations of test result changes in the relevant commit messages. I had to force-push to do this, so I rebased on main at the same time. There were no changes to the commits except the commit messages.

@owen-mc owen-mc force-pushed the go/refactor-data-flow-configurations branch 2 times, most recently from 0bd3ce8 to 19dbe79 Compare August 11, 2023 08:54
owen-mc and others added 2 commits August 11, 2023 10:06
We preserve all old QLDocs, but move them from the
config to the Flow module. This makes more sense than
the Config module, which is often private, and is generally
not directly accessed.
@owen-mc owen-mc force-pushed the go/refactor-data-flow-configurations branch from 19dbe79 to 35a300f Compare August 11, 2023 09:06
Copy link
Member

@mbg mbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚢

@owen-mc owen-mc merged commit c10d03e into github:main Aug 11, 2023
10 checks passed
@owen-mc owen-mc deleted the go/refactor-data-flow-configurations branch August 11, 2023 09:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Go no-change-note-required This PR does not need a change note
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants