Add baseline metrics for lines of code #459

aeisenberg · 2021-04-22T22:51:22Z

This commit uses a third-party library to estimate the lines of code in
a database that is to be analyzed by CodeQL.

The estimate uses the same includes and excludes globs for determining
which files should be counted.

The lines of code count is returned by language and injected into the
SARIF as appropriate.

~~Currently, this PR adds the LoC data in the metricResults property of the sarif in a blob like this:~~

      {
        metric: `baseline/${language}/lines-of-code`,
        value: lineCounts[language]
      }

~~We haven't agreed on what this will look like, so injecting the metric may change.~~

We've decided that the lines of code will be injected into metrics with an id of the form `${language}/summary/lines-of-code`, and a new `baseline` property is added to that metric. Languages that we have a count for, but no matching metric, will be ignored.
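As a rough illustration of the rule above, here is a minimal sketch, not the code in this PR. The interfaces, the `injectBaseline` helper, and the `ruleId` field name are assumptions made for the example; only the `metricResults` property, the `${language}/summary/lines-of-code` id, and the `baseline` property come from the description.

```typescript
// Hypothetical shapes: only the fields this sketch touches are modelled.
interface MetricResult {
  ruleId: string; // assumed name of the field holding the metric id
  value?: number;
  baseline?: number;
}

interface SarifRun {
  properties?: { metricResults?: MetricResult[] };
}

interface SarifLog {
  runs?: SarifRun[];
}

// Hypothetical helper: adds a `baseline` property to each existing
// `${language}/summary/lines-of-code` metric. Languages with a count
// but no matching metric are ignored.
function injectBaseline(
  sarif: SarifLog,
  lineCounts: Record<string, number>
): void {
  for (const run of sarif.runs ?? []) {
    const metrics = run.properties?.metricResults ?? [];
    for (const language of Object.keys(lineCounts)) {
      const id = `${language}/summary/lines-of-code`;
      const metric = metrics.find((m) => m.ruleId === id);
      if (metric !== undefined) {
        metric.baseline = lineCounts[language];
      }
    }
  }
}
```

In this sketch no new metric entries are created, which is what makes "count but no metric" a no-op.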

Merge / deployment checklist

Confirm this change is backwards compatible with existing workflows.
Confirm the readme has been updated if necessary.

robertbrignull · 2021-04-23T10:11:10Z

src/analyze.ts

@@ -145,6 +146,15 @@ export async function runQueries(
 ): Promise<QueriesStatusReport> {
  const statusReport: QueriesStatusReport = {};

+  // count the number of lines in the background
+  const locPromise = countLoc(


This is clever, so the promise is only resolved when it's used, but it is potentially used in multiple places. Worth noting that in practice there will always be some queries to evaluate, so we'll always end up using this promise. It's an error for there not to be any queries to analyse, and it would have errored back in the init step. Up to you whether you therefore want to leave this as it is or potentially simplify it.

The main reason why I'm doing this is so that the line counting can happen in "parallel". On large projects, counting can take 10-20s and there's lots of disk IO, so it's nice to be able to run this while other things are happening.
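A minimal sketch of the pattern being discussed, assuming a stand-in counter rather than the real `countLoc`: the promise is created up front so the work overlaps with everything that follows, and it can be awaited in several places while the counting itself runs only once. `countLines`, `analyzeLanguages`, and `countCalls` are hypothetical names for illustration.

```typescript
let countCalls = 0; // tracks how often the (hypothetical) counter runs

// Stand-in for the real line counter; the delay simulates disk IO.
async function countLines(): Promise<Record<string, number>> {
  countCalls += 1;
  await new Promise<void>((resolve) => setTimeout(resolve, 10));
  return { javascript: 120 };
}

async function analyzeLanguages(languages: string[]): Promise<string[]> {
  // Start counting now, without awaiting, so it proceeds in the background.
  const locPromise = countLines();

  const reports: string[] = [];
  for (const language of languages) {
    // ... other work for this language would happen here ...

    // Awaiting the same promise again does not restart the counting;
    // once it has settled, every await yields the same result.
    const counts = await locPromise;
    reports.push(`${language}: ${counts[language] ?? 0}`);
  }
  return reports;
}
```

Calling `analyzeLanguages(["javascript", "python"])` awaits `locPromise` twice, but `countLines` is invoked a single time.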

src/count-loc.ts

adityasharad

Nice and clear. Couple of recommendations based on CodeQL conventions for labelling languages.

src/count-loc.ts

robertbrignull

LGTM from my point of view. Probably best to let @adityasharad also review the recent changes.

This commit uses a third-party library to estimate the lines of code in a database that is to be analyzed by CodeQL. The estimate uses the same includes and excludes globs for determining which files should be counted. The lines of code count is returned by language and injected into the SARIF as a `baseline` property in the `${language}/summary/lines-of-code` metric.

aeisenberg marked this pull request as draft April 22, 2021 22:51

aeisenberg force-pushed the aeisenberg/add-github-linguist branch from c5d6cae to d2b4652 Compare April 22, 2021 22:54

aeisenberg force-pushed the aeisenberg/add-linguist-data branch from 5c4018b to d6f3eb6 Compare April 22, 2021 22:57

aeisenberg force-pushed the aeisenberg/add-github-linguist branch from d2b4652 to c4a84a9 Compare April 22, 2021 22:59

aeisenberg force-pushed the aeisenberg/add-linguist-data branch from d6f3eb6 to 76702a2 Compare April 22, 2021 23:00

robertbrignull reviewed Apr 23, 2021

aeisenberg force-pushed the aeisenberg/add-linguist-data branch 2 times, most recently from 74c1aef to 5201fb4 Compare April 23, 2021 17:55

aeisenberg marked this pull request as ready for review April 23, 2021 17:57

Base automatically changed from aeisenberg/add-github-linguist to main April 23, 2021 17:59

aeisenberg force-pushed the aeisenberg/add-linguist-data branch from 5201fb4 to a1e16be Compare April 23, 2021 18:04

adityasharad reviewed Apr 23, 2021

src/count-loc.ts

src/count-loc.ts Outdated

aeisenberg force-pushed the aeisenberg/add-linguist-data branch 2 times, most recently from 0ce85b6 to 674720b Compare April 23, 2021 21:59

robertbrignull approved these changes Apr 26, 2021

aeisenberg assigned adityasharad Apr 26, 2021

adityasharad reviewed Apr 26, 2021

src/count-loc.ts Outdated

src/analyze.ts Outdated

aeisenberg force-pushed the aeisenberg/add-linguist-data branch from 674720b to 28900de Compare April 26, 2021 18:35

adityasharad reviewed Apr 26, 2021

src/analyze.test.ts

aeisenberg force-pushed the aeisenberg/add-linguist-data branch from 28900de to 998f472 Compare April 26, 2021 21:09

adityasharad approved these changes Apr 26, 2021

aeisenberg merged commit 03f029c into main Apr 26, 2021

aeisenberg deleted the aeisenberg/add-linguist-data branch April 26, 2021 21:23

github-actions bot mentioned this pull request Apr 30, 2021

Merge main into v1 #471

Merged
