Move logs, SARIF, database bundle actions uploads to post: hooks #1159

angelapwen · 2022-07-29T09:41:41Z

Previously, even with debug mode on, if the init step failed we did not upload the appropriate Actions artifacts. This was because the artifacts were only uploaded in the analyze step.

This change:

- moves the uploading of log files to a post: hook in the init step. Regardless of whether the entire workflow was successful or if any steps after init failed, we will upload whatever logfiles we have as an artifact.
- moves the uploading of database bundles to a post: hook in the init step. If the database has not been finalized, instead of running the CLI database bundle command, the action directly zips up everything in the database directory.
- moves the uploading of SARIF results (if they exist) to a post: hook in the analyze step.
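The finalized versus non-finalized choice described in the list above could be sketched roughly as follows. This is an illustration only, not the code in this PR: `BundleSteps`, `createDatabaseBundle`, and the injected functions are hypothetical names, and the `.zip` name used for a finalized database is an assumption. Only the `-partial.zip` suffix comes from the log output quoted later in this conversation.

```typescript
// Hypothetical sketch: the CLI call and the zip step are injected so the
// example does not depend on any particular archive library or CLI wrapper.
interface BundleSteps {
  isFinalized: (databasePath: string) => boolean;
  bundleWithCli: (databasePath: string, outputPath: string) => void;
  zipDirectory: (databasePath: string, outputPath: string) => void;
}

function createDatabaseBundle(databasePath: string, steps: BundleSteps): string {
  if (steps.isFinalized(databasePath)) {
    // Finalized database: defer to the CLI bundle command.
    // The output name here is an assumption made for this sketch.
    const outputPath = `${databasePath}.zip`;
    steps.bundleWithCli(databasePath, outputPath);
    return outputPath;
  }
  // Non-finalized database: zip everything in the database directory instead.
  const outputPath = `${databasePath}-partial.zip`;
  steps.zipDirectory(databasePath, outputPath);
  return outputPath;
}
```

Injecting the steps also makes it easy to unit test both branches without a real database on disk.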

So far the change has been manually tested so that:

- when init fails, partial logs from init are uploaded as artifacts, and the partial database bundle is uploaded.
- when analyze fails, logs from init and analyze are uploaded as artifacts; the partial database bundle is uploaded; and the SARIF upload succeeds if there is a SARIF file generated in the output folder of the action.
- on workflow success, all file types are successfully uploaded.

Testing strategy:

- unit tests for the scripts called by post: hooks
- integration tests showing what happens when init and analyze fail; will write after Bump @actions/core from 1.8.0 to 1.9.1 in /packages/artifact actions/toolkit#1160 is merged.

Merge / deployment checklist

- Confirm this change is backwards compatible with existing workflows.
- [N/A] Confirm the readme has been updated if necessary.
- Confirm the changelog has been updated if necessary.

angelapwen · 2022-07-29T10:45:00Z

I'm correctly hitting the case where the database isn't finalized now, and just need to implement the zipping of the partial database files. I see documentation for the CLI command but am still not exactly sure which files I'm looking to zip. I'll take a look at the CLI source code but would appreciate any pointers!

Also, I see I have a CI failure on debug artifact which looks legitimate, as that's the code I'm changing. It only fails on Ubuntu, as for some reason the step to download the artifact doesn't begin on the macOS runners. It looks like this is because the step to download the artifacts is executed before the post: hooks that upload them, so I'll have to find out how to reorder these.

I see the PR check comes from

codeql-action/pr-checks/checks/debug-artifacts.yml

Line 16 in b100b75

- uses: actions/download-artifact@v3

which executes directly after the analyze step. Not entirely sure how to make sure it executes only after the post: hooks as I guess the entire point is that the post: hooks happen after everything else. 🤔

angelapwen · 2022-07-29T13:16:07Z

I think perhaps splitting the PR check file into 2 jobs, one for running everything before checking artifact downloads, and one for just checking artifact downloads with a needs dependency on the former, might force things into the right order. Similar to the code blocks in the docs at https://docs.github.com/en/actions/learn-github-actions/essential-features-of-github-actions. I'll try to do this in a separate PR and see if it's at least a no-op without these new post: hook changes.


aeisenberg · 2022-07-29T16:34:02Z

src/actions-util.ts

+  let suffix = "";
+  const matrix = getRequiredInput("matrix");
+  if (matrix !== undefined && matrix !== "null") {
+    for (const entry of Object.entries(JSON.parse(matrix)).sort())


What happens if matrix is not valid json? How is the error handled?

Hm — I actually didn't write this (just moved it to another section of the codebase) but that's a good question. I had thought that it was forced to be JSON from where it was declared in inputs:

codeql-action/init/action.yml

Lines 14 to 15 in 19d025e

matrix:

default: ${{ toJson(matrix) }}

and

https://github.com/github/codeql-action/blob/main/analyze/action.yml#L67-L68

but it seems like that's only the default input, and the user could theoretically have passed invalid JSON.

I am adding a catch block with an error message indicating that there was an error parsing the matrix input and that the debug artifact will be uploaded without the matrix input in its name. It seems like the only thing this block of code does is add a suffix to the artifact name for every value in the matrix input, so it should still be able to upload the artifact correctly (just without the appropriate suffix(es)).
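A minimal sketch of the catch block described above, assuming the suffix is built from the values of the parsed matrix. `matrixSuffix` and its `warn` parameter are hypothetical names for illustration, and the warning text is invented rather than taken from the action.

```typescript
// Hypothetical helper: builds an artifact-name suffix from the `matrix` input.
// If the input is not valid JSON, it warns and returns an empty suffix so the
// artifact can still be uploaded under its base name.
function matrixSuffix(
  matrix: string | undefined,
  warn: (message: string) => void = console.warn
): string {
  if (matrix === undefined || matrix === "null") {
    return "";
  }
  try {
    const parsed: unknown = JSON.parse(matrix);
    if (typeof parsed !== "object" || parsed === null) {
      return "";
    }
    let suffix = "";
    // Sort the entries so the suffix does not depend on key order.
    for (const [, value] of Object.entries(parsed).sort()) {
      suffix += `-${String(value)}`;
    }
    return suffix;
  } catch (e) {
    warn("Could not parse the matrix input. The debug artifact name will not include matrix values.");
    return "";
  }
}
```

With this shape, a bad `matrix` value degrades the artifact name rather than failing the upload.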


aeisenberg · 2022-07-29T16:42:00Z

src/init-action-cleanup.ts

+  }
+}
+
+async function uploadLogsDebugArtifact(config: Config) {


Same comment here about making this module as small as possible, and passing in actionsUtil.uploadDebugArtifacts as a function parameter.


angelapwen · 2022-08-01T10:29:03Z

I've made most of the requested changes, and will work on the refactoring for unit tests.

I also made the change to zip up the database bundle and it seems to be working as expected on a failed run.

The core.info logs state:

db-javascript is not finalized. Uploading partial database bundle at /home/runner/work/_temp/codeql_databases/db-javascript-partial.zip...

and then the zip file is uploaded as part of the artifacts. I can't attach it here due to file size limits but here is the directory structure of the partial database bundle file when unzipped:

angelapwen · 2022-08-01T11:19:23Z

I added //TODOs in locations where I am planning to unit test, with the description of what each unit test should do, to get feedback on the planned tests before/while I write them.

adityasharad

Good work! The overall logic looks sensible -- suggestions mostly about clarity and readability.

Let me know when you've added some tests and would like another look.

adityasharad · 2022-08-01T17:26:34Z

src/actions-util.ts

 import * as core from "@actions/core";
 import * as toolrunner from "@actions/exec/lib/toolrunner";
 import * as safeWhich from "@chrisgavin/safe-which";
+import AdmZip from "adm-zip";


We already use the zlib library; I think we could reuse that in this situation rather than adding a dependency. One difference is that zlib produces gzip.

Ah interesting. I will give this a try!

Ah, I remembered that I had some trouble with zlib when I tried it out earlier because it seems to only compress streams of data, and it makes preserving the directory structure difficult. I looked around 👀 and see another package that can create zipped directories using zlib for compression: https://github.com/archiverjs/node-archiver

But as this would also add a new dependency I'm not sure if it's worthwhile 🤔

Let's stick with what you've got for now. There is an astounding number of ways to do this in the npm ecosystem, and I don't have the expertise/familiarity to know whether any one is best, so I won't hold this up. adm-zip is widely used and has no other dependencies, which is nice. (In the VS Code extension I see we use zip-a-folder.)
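To illustrate the point about zlib made earlier in this thread: Node's built-in `zlib` module compresses a single stream or buffer of bytes and has no notion of file names or directories, so it cannot preserve a directory structure on its own. A small sketch, with an invented payload:

```typescript
import * as zlib from "zlib";

// One in-memory byte buffer standing in for the contents of a single file.
const original = Buffer.from("contents of a single log file\n".repeat(100));

// gzip takes bytes in and gives bytes out. There is no entry name or path.
const compressed = zlib.gzipSync(original);
const roundTripped = zlib.gunzipSync(compressed);

console.log(`original: ${original.length} bytes, gzipped: ${compressed.length} bytes`);
console.log(`round trip matches: ${roundTripped.equals(original)}`);
```

Bundling a whole directory therefore needs an archive format layered on top (zip, tar, and so on), which is what packages like adm-zip provide.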


angelapwen · 2022-08-08T11:38:07Z

Quickly closing and re-opening to trigger new PR checks from actions/toolkit#1160


adityasharad · 2022-08-09T16:24:47Z

That looks good. The other option I considered was turning this workflow off on PRs, and only running it on push events. Let's keep this for now, and if the "failing" check confuses other contributors we can adjust it.


adityasharad

Some minor suggestions while you polish up the tests.


aeisenberg · 2022-08-10T20:27:57Z

src/analyze-action-post.ts

+import * as debugArtifacts from "./debug-artifacts";
+import { getActionsLogger } from "./logging";
+
+async function run(uploadSarifDebugArtifact: Function) {


For easier testing, you can move this function to a different file and test that file instead. This way, you avoid invoking runWrapper by default.

I've done this, though I didn't know what to call the other file (not sure what patterns we want to follow). I called it analyze-action-post-helper but would like to hear what you think!
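The pattern discussed here (a small helper module whose `run` function receives the upload function as a parameter) might look roughly like this. The `Config` shape, `UploadFn`, and `runPostHook` are simplified, hypothetical stand-ins and not the actual signatures in the action.

```typescript
// Hypothetical, simplified stand-ins for the real types.
interface Config {
  debugMode: boolean;
}
type UploadFn = (config: Config) => Promise<void>;

// The helper takes the upload function as a parameter, so a unit test can
// pass a fake and never touch the real artifact upload or runWrapper.
async function runPostHook(upload: UploadFn, config: Config | undefined): Promise<void> {
  if (config === undefined || !config.debugMode) {
    // Nothing to upload when there is no config or debug mode is off.
    return;
  }
  await upload(config);
}
```

The entry-point file then only needs to call the helper with the real upload function, which keeps the untested surface small.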

aeisenberg · 2022-08-10T20:34:24Z

src/debug-artifacts.ts

+
+/**
+ * If a database has not been finalized, we cannot run the `codeql database bundle`
+ * command in the CLI because it will return an error. Instead we directly zip


That's unfortunate and avoids a good use case for codeql database bundle. Do you know why it throws? I wonder if we could add a --force option (or something similar) to allow the command to succeed even if the database is malformed.

I did a little digging into the CLI source code, but I thought it'd be too complex to somehow catch the error thrown in the action and then do something different. A --force option would be interesting though and isn't something I'd considered.

We'd like to get this change in this week but I can write up a follow-up issue on the --force option on the CLI side for our backlog.

Right...there's no need to get this done now.


angelapwen · 2022-08-11T14:11:40Z

The remaining unit tests are up, requesting final review now 😄

After this is merged, I will add the "Download and check debug artifacts after failure in analyze" PR check/job to the required PR checks for each branch. I'll also write up an issue on improving the codeql database bundle command to handle non-finalized databases.

adityasharad

Looks good!

adityasharad · 2022-08-11T14:24:48Z

src/analyze-action-post-helper.test.ts

+
+test("post: analyze action with debug mode off", async (t) => {
+  return await util.withTmpDir(async (tmpDir) => {
+    process.env["RUNNER_TEMP"] = tmpDir;


You may want to save the old value and restore it at the end of this test, so it doesn't affect the environment for later tests. Or turn it into a stub.

Hm — I see it used this way without restoring in several other existing tests like

codeql-action/src/config-utils.test.ts

Line 626 in a6d0901

process.env["RUNNER_TEMP"] = tmpDir;

Is there a reason this one would act differently from other tests?

Since it's used elsewhere in this way, it's probably fine. I would recommend that in the future we do what Aditya suggests in all the tests. The danger if we don't is that there may be hard-to-spot dependencies between tests that use this environment variable.

I see we also have this helper function that is called quite a lot

codeql-action/src/testing-utils.ts

Line 102 in a6d0901

export function setupActionsVars(tempDir: string, toolsDir: string) {

although we don't need the other two environment variables here.

Hm, does this line

codeql-action/src/testing-utils.ts

Line 96 in cade2b5

process.env = t.context.env;

effectively reset the environment vars after each test?

Yes...it does. I also see sinon.restore();, which makes my comment about using a sandbox irrelevant.
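For a test that does not go through the shared setup, the save-and-restore suggestion from the start of this thread could be sketched as below. `withEnvVar` is a hypothetical helper, and the environment is passed in explicitly so the sketch does not depend on `process.env`.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: sets one variable for the duration of `body`, then
// puts the environment back exactly as it was, even if `body` throws.
// Synchronous only: an async body would need an awaiting variant.
function withEnvVar<T>(env: Env, name: string, value: string, body: () => T): T {
  const hadValue = Object.prototype.hasOwnProperty.call(env, name);
  const previous = env[name];
  env[name] = value;
  try {
    return body();
  } finally {
    if (hadValue) {
      env[name] = previous;
    } else {
      // The variable did not exist before, so remove it rather than
      // leaving it set to undefined.
      delete env[name];
    }
  }
}
```

In a Node test this would be called with `process.env` as the first argument, for example to set `RUNNER_TEMP` around a single assertion.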

src/analyze-action-post-helper.ts

angelapwen · 2022-08-11T14:55:44Z

src/analyze-action-post-helper.test.ts

+    } as unknown as configUtils.Config);
+
+    const requiredInputStub = sinon.stub(actionsUtil, "getRequiredInput");
+    requiredInputStub.withArgs("output").returns("fake-output-dir");


On a related note, I see that in some places we use ${STUB}.restore() to clean up at the end of a test, but in others we don't. Is this necessary or best practice?

Same as above, I think this line

codeql-action/src/testing-utils.ts

Line 93 in cade2b5

sinon.restore();

resets the sinon stubs after each test?

Actually...we should do that everywhere. The vscode extension uses a different (and better) idiom, which we should move to in the action.

Instead of creating stubs directly using sinon.stub, use a sandbox:

let sandbox: sinon.SinonSandbox;
beforeEach(() => {
  sandbox = sinon.createSandbox();
});
afterEach(() => {
  sandbox.restore();
});

And create stubs using: sandbox.stub(). This ensures all stubs are removed after every test runs. In the action, we are not doing this. So far, it hasn't caused any problems, but it might later.

For now it looks like as long as we call

codeql-action/src/testing-utils.ts

Line 49 in cade2b5

export function setupTests(test: TestFn<any>) {

in each test file it should call sinon.restore() and also restore the environment variables, right?

Yes....I missed that. So, you can ignore my comment.

Ok, great! This was educational 🧑‍🎓


Move logs, SARIF actions uploads to post: hooks

1016eba

angelapwen requested a review from a team as a code owner July 29, 2022 09:41

Catch case where database isn't finalized

2746051

angelapwen mentioned this pull request Jul 29, 2022

Split debug artifacts PR check into two jobs #1160

Merged

1 task

aeisenberg reviewed Jul 29, 2022


angelapwen and others added 6 commits August 1, 2022 11:24

Zip partial database directory

2c25894

Refactor helper function to util

52de49c

More descriptive partial db bundle messages

ebc59ec

Improve for matrix

af87cc6

Co-authored-by: Andrew Eisenberg

Minor syntax update

6630cbe

Co-authored-by: Andrew Eisenberg

Error handling for JSON parsing

8a4a573

angelapwen changed the title ~~Move logs, SARIF actions uploads to post: hooks~~ Move logs, SARIF, database bundle actions uploads to post: hooks Aug 1, 2022

angelapwen added 2 commits August 1, 2022 12:52

Refactoring per PR comments

5da7870

Add unit test descriptions

5229df1

Linting, node_modules update

daaac43

adityasharad reviewed Aug 1, 2022


angelapwen added 6 commits August 2, 2022 12:01

Clean up syntax per PR review

a557279

Add top level comments, rename cleanup to post

44a27e6

Address more PR comments, refactoring

5895ab0

Move debug artifact methods into separate file

eeee462

Add more info messages to user, rename log printing function

a758ec5

Move debug log printing back to actions util

7f86ddc

angelapwen closed this Aug 8, 2022

angelapwen reopened this Aug 8, 2022

Merge remote-tracking branch 'origin/main' into angelapwen/post-init-…

010abe7

…cleanup

angelapwen force-pushed the angelapwen/post-init-cleanup branch from ea909e8 to ff7a29d Compare August 10, 2022 10:11

angelapwen added 3 commits August 10, 2022 14:57

Add utilities unit tests

484a72c

Merge remote-tracking branch 'origin/main' into angelapwen/post-init-…

90676d9

…cleanup

Re-declare codeql var

3c4f458

adityasharad reviewed Aug 10, 2022


aeisenberg reviewed Aug 10, 2022


angelapwen and others added 9 commits August 11, 2022 13:45

Address review comments

65d6ee0

Update CHANGELOG.md wording

fa59c28

Co-authored-by: Aditya Sharad

Update comment wording

d909f71

Co-authored-by: Andrew Eisenberg

Address additional review comments

4e121c0

Improve file system unit tests

6fdaff6

Merge remote-tracking branch 'origin/main' into angelapwen/post-init-…

15608ce

…cleanup

Add unit tests for post: hook run methods

26cafd2

Remove extraneous files

fd83e55

Improve doesDirectoryExist test

172eca4

angelapwen requested review from aeisenberg and adityasharad August 11, 2022 14:11

adityasharad approved these changes Aug 11, 2022


angelapwen added 2 commits August 11, 2022 16:46

Make file paths OS-agnostic

cf7f893

Remove review comments

79b933c

angelapwen commented Aug 11, 2022


aeisenberg approved these changes Aug 11, 2022


angelapwen merged commit b659ce5 into main Aug 11, 2022

angelapwen deleted the angelapwen/post-init-cleanup branch August 11, 2022 16:00

angelapwen mentioned this pull request Aug 12, 2022

Add expect-error input to force PR check green on expected failure #1177

Merged

3 tasks

This was referenced Aug 17, 2022

Merge main into releases/v2 #1189

Closed

Merge main into releases/v2 #1192

Merged

Merge releases/v2 into releases/v1 #1195

Merged
