Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resetting the list of orgs to include or exclude in the search for each image #326

Merged
merged 7 commits into from
Mar 10, 2022
Merged

Resetting the list of orgs to include or exclude in the search for each image #326

merged 7 commits into from
Mar 10, 2022

Conversation

avimanyum
Copy link
Contributor

While running the all subcommand, there is a Git search that happens for each image found in the tag store one after the other. While processing one of the images, if the hashmap orgsToIncludeInSearch is modified (this can happen one image returns more than 1000 search results. Refer to this for more details), the modified map persists and is carried over while performing the search for the next image in the tag store. This is a bug.
This PR resets the hashmap before performing the search for each image.


Map<String, Boolean> orgsToIncludeInSearch = new HashMap<>();
if (ns.get(Constants.GIT_ORG) != null) {
// If there is a Git org specified, that needs to be included in the search query. In
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line is longer than 100 characters (found 105).

Map<String, Boolean> orgsToIncludeInSearch = new HashMap<>();
if (ns.get(Constants.GIT_ORG) != null) {
// If there is a Git org specified, that needs to be included in the search query. In
// the orgsToIncludeInSearch a true value associated with an org name ensures that
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line is longer than 100 characters (found 102).

@codecov
Copy link

codecov bot commented Mar 8, 2022

Codecov Report

Merging #326 (9ea7bc8) into master (51c97da) will increase coverage by 0.01%.
The diff coverage is 100.00%.

Impacted file tree graph

@@             Coverage Diff              @@
##             master     #326      +/-   ##
============================================
+ Coverage     81.47%   81.49%   +0.01%     
  Complexity      263      263              
============================================
  Files            19       19              
  Lines           934      935       +1     
  Branches        126      126              
============================================
+ Hits            761      762       +1     
  Misses          142      142              
  Partials         31       31              
Impacted Files Coverage Δ
...kerfileimageupdate/utils/DockerfileGitHubUtil.java 85.45% <100.00%> (+0.06%) ⬆️

@avimanyum avimanyum added the bug Bug fixes label Mar 8, 2022
afalko
afalko previously approved these changes Mar 10, 2022
Copy link
Collaborator

@afalko afalko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It think it wants us to add a unit test for this case. If it is too difficult, I'm still OK merging.

@afalko afalko requested a review from cjneasbi March 10, 2022 16:58
@cjneasbi
Copy link
Contributor

So I think this change is solving the wrong problem. The problem is a method in the call chain is programming by side effect instead of just modifying its return value. If I understand correctly the map being cleared is passed here, to here, and then is inappropriately modified at here. Thats the actual bug, getSearchResultsExcludingOrgWithMostHits should not be modifying its input parameter to which is programming by side effect. Instead we should change the return value of getSearchResultsExcludingOrgWithMostHits to also return any additional information (like the updates to the input parameter) required by the callers.

@@ -59,7 +52,13 @@ public void execute(final Namespace ns, final DockerfileGitHubUtil dockerfileGit
GitForkBranch gitForkBranch = getGitForkBranch(image, tag, ns);

log.info("Finding Dockerfiles with the image name {}...", image);

Map<String, Boolean> orgsToIncludeInSearch = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments about why I think this is solving the wrong problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The map is being modified as expected. The modification of the map happens only when we exclude orgs recursively to reduce the search space to get around the limit that Git has of returning only 1000 search results.
For every image that is being searched for from the image tag store, the map needs to start as an empty map. Only if the search of that image returns more than 1000 results does the map get modified - which is what we want.
However, when that image is processed completely and the run proceeds to the next image, this map should again start as an empty map and not automatically exclude the orgs that were excluded during the search of some previous image from the tag store.

Map<String, Boolean> orgsToExcludeFromSearch = new HashMap<>();
orgsToExcludeFromSearch.putAll(orgsToExclude);
orgsToExcludeFromSearch.put(orgWithMaximumHits, false);
log.info("Running search by excluding the orgs {}.", orgsToExcludeFromSearch.keySet().toString());
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line is longer than 100 characters (found 106).

Optional<List<PagedSearchIterable<GHContent>>> contentsExcludingOrgWithMaximumHits;
contentsExcludingOrgWithMaximumHits = findFilesWithImage(image, orgsToExclude, gitApiSearchLimit);
contentsExcludingOrgWithMaximumHits = findFilesWithImage(image, orgsToExcludeFromSearch, gitApiSearchLimit);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line is longer than 100 characters (found 116).

Map<String, Boolean> orgsToExcludeFromSearch = new HashMap<>();
orgsToExcludeFromSearch.putAll(orgsToExclude);
orgsToExcludeFromSearch.put(orgWithMaximumHits, false);
log.info("Running search by excluding the orgs {}.", orgsToExcludeFromSearch.keySet().toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

orgsToExclude.put(orgWithMaximumHits, false);
log.info("Running search by excluding the orgs {}.", orgsToExclude.keySet().toString());
Map<String, Boolean> orgsToExcludeFromSearch = new HashMap<>();
orgsToExcludeFromSearch.putAll(orgsToExclude);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need putAll here if you this constructor for HashMap, https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html#HashMap-java.util.Map-

@@ -177,10 +177,12 @@ protected String getOrgNameWithMaximumHits(PagedSearchIterable<GHContent> files)
Optional<List<PagedSearchIterable<GHContent>>> contentsForOrgWithMaximumHits;
contentsForOrgWithMaximumHits = findFilesWithImage(image, orgsToInclude, gitApiSearchLimit);

orgsToExclude.put(orgWithMaximumHits, false);
log.info("Running search by excluding the orgs {}.", orgsToExclude.keySet().toString());
Map<String, Boolean> orgsToExcludeFromSearch = new HashMap<>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Local variables should be marked as final if possible to prevent reference reassignment.

@github-actions github-actions bot added the tests Improvements / maintenance to tests label Mar 10, 2022
@@ -330,6 +330,7 @@ public void getSearchResultsExcludingOrgWithMostHits() throws Exception {
Map<String, Boolean> orgsToIncludeOrExclude = new HashMap<>();

assertEquals((dockerfileGitHubUtil.getSearchResultsExcludingOrgWithMostHits("image", contentsWithImage, orgsToIncludeOrExclude, 1000)).get().size(), 2);
assertEquals(orgsToIncludeOrExclude.size(), 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: We might want to add a small comment as to why we are checking for this condition.

@@ -177,10 +177,11 @@ protected String getOrgNameWithMaximumHits(PagedSearchIterable<GHContent> files)
Optional<List<PagedSearchIterable<GHContent>>> contentsForOrgWithMaximumHits;
contentsForOrgWithMaximumHits = findFilesWithImage(image, orgsToInclude, gitApiSearchLimit);

orgsToExclude.put(orgWithMaximumHits, false);
log.info("Running search by excluding the orgs {}.", orgsToExclude.keySet().toString());
final Map<String, Boolean> orgsToExcludeFromSearch = new HashMap(orgsToExclude);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You'll want to use new HashMap<>(orgsToExclude) otherwise you'll get type warnings during compilation.

@afalko afalko merged commit 859c50c into salesforce:master Mar 10, 2022
@justinharringa justinharringa removed the tests Improvements / maintenance to tests label Mar 23, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bug fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants