Handle multiple languages for an extension #1
Just in case you don't get tagged for PRs in this repository: @aeisenberg
Thanks for creating this PR! Just a handful of small nits. And a question.
src/languages.ts
Outdated
 * where the extension is ambiguous. The ordering of the list matters and
 * languages earlier on will get a higher priority when resolving clashes.
 */
const importantLanguages = ["javascript", "typescript", "ruby", "python", "java", "c", "c++", "c#", "rust", "scala", "perl", "go"];
Could you move this out to a top-level constant?
And just curious, can you give an example of where the ordering here is important?
The only example I found where the ordering actually mattered was .spec, which I described in the PR description: this extension appears in a few languages but (to the best of my understanding) is mostly important for Ruby, where it is common in unit tests. I still ordered the other languages roughly by popularity in case any ambiguities arise in their extensions in future, but at the moment there are no other ambiguities among these languages (except .h being shared between c and c++, but in CodeQL we don't care about that because they both map to the same thing).
PR comments above addressed, thanks 🙂
Nice!
I'll do a release shortly.
Addresses the underlying cause of github/codeql-action#584, though will not close it yet as we still have to bump the version of this that the CodeQL Action depends on and add a test there to avoid a regression.
The language map that we import unfortunately has some ambiguity in what language an extension maps to, since some programming languages share extensions. Currently, we just use whatever appears later in the alphabet, which results in (for example) Smalltalk overriding C#, which is clearly not desirable. This PR adds a simple heuristic: a list of languages that are important enough to take priority over other ones. The list is roughly based on a list of the most popular programming languages in order of popularity, with some manual adjustments for clashes that, in my opinion, ended up being resolved the wrong way by popularity alone. For example, I moved Ruby quite high up because .spec appears a lot in Ruby tests but is also an extension in Python, where it is used much less.
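The heuristic described above could look roughly like the following. This is a minimal sketch, not the code in this PR: the LanguageMap shape, the resolveExtensions function, the shortened priority list, and the lowercase sample data are all assumptions made for illustration.

```typescript
// Hypothetical input shape: language name -> file extensions it claims.
type LanguageMap = Record<string, string[]>;

// Hypothetical, shortened priority list; earlier entries win clashes.
const priorityLanguages: readonly string[] = ["ruby", "python", "c#"];

function resolveExtensions(
  languages: LanguageMap,
  priority: readonly string[],
): Record<string, string> {
  // Languages missing from the priority list all share the lowest rank.
  const rank = (name: string): number => {
    const index = priority.indexOf(name);
    return index === -1 ? priority.length : index;
  };

  // Visit languages alphabetically so that, among unlisted languages,
  // the one later in the alphabet still wins (the old behaviour).
  const entries = Object.entries(languages).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0,
  );

  const result: Record<string, string> = {};
  for (const [name, extensions] of entries) {
    for (const ext of extensions) {
      const current = result[ext];
      // Replace the current owner only if the new language ranks at
      // least as high (a lower index means a higher priority).
      if (current === undefined || rank(name) <= rank(current)) {
        result[ext] = name;
      }
    }
  }
  return result;
}

// Hand-written sample data, not the real imported language map.
const sample: LanguageMap = {
  "c#": [".cs"],
  smalltalk: [".cs", ".st"],
  python: [".py", ".spec"],
  ruby: [".rb", ".spec"],
  "rpm spec": [".spec"],
};

const resolved = resolveExtensions(sample, priorityLanguages);
console.log(resolved[".cs"], resolved[".spec"], resolved[".st"]);
```

With this sample, .cs stays with c# instead of being overridden by smalltalk, and .spec goes to ruby because it is listed ahead of python. Extensions claimed by a single language, such as .st, are unaffected.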