-
Notifications
You must be signed in to change notification settings - Fork 641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Embedded license file ingestion for Gallery #6580
Conversation
I haven't seen the latest revision.
@@ -106,6 +106,8 @@ public static class GalleryConstants | |||
|
|||
public const string GitRepository = "git"; | |||
|
|||
public const string LicenseDeprecationUrl = "https://aka.ms/deprecateLicenseUrl"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://aka.ms/deprecateLicenseUrl [](start = 53, length = 34)
FYI, this link redirects to nuget's documentation home page. Is that intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to remember to update it to something useful when the doc becomes available.
src/NuGetGallery/Services/LicenseUrlDeprecationValidationMessage.cs
Outdated
Show resolved
Hide resolved
src/NuGetGallery/Services/LicenseUrlDeprecationValidationMessage.cs
Outdated
Show resolved
Hide resolved
} | ||
|
||
return null; | ||
} | ||
|
||
private static async Task<bool> IsStreamLengthMatchesReportedAsync(Stream licenseFileStream, long reportedLength) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsStreamLengthMatchesReportedAsync [](start = 40, length = 34)
This could be a nice Stream
extension method. Maybe you could make a Stream
extension class that also has LooksLikeUtf8TextStreamAsync
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like to have a Stream
extension methods that read from it. It should be very obvious that this operation consumes Stream
(which is important for streams that are not seekable). I don't think extensions methods make it obvious.
throw new InvalidOperationException($"Unsupported license metadata type: {licenseMetadata.Type}"); | ||
} | ||
|
||
private static bool HasChildElements(XElement xElement) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HasChildElements [](start = 28, length = 16)
What do you think of moving HasChildElements
, GetLicenseVersion
and GetLicenseType
into LicenseCheckingNuspecReader
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO LicenseCheckingNuspecReader
should do all XML-based operations itself and completely abstract away the fact the XML-shape of the nuspec.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I specifically moved them out not to call LicenseCheckingNuspecReader.LicenseElement
repeatedly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally, you could move all the functions into LicenseCheckingNuspecReader
such that you can make LicenseElement
private.
private class LicenseCheckingNuspecReader : NuspecReader
{
private XElement _licenseElement => MetadataNode.Element(MetadataNode.Name.Namespace + LicenseNodeName);
public string GetLicenseVersion() => _licenseElement.GetOptionalAttributeValue("version");
/* other functions here */
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what is the cost of calling MetadataNode.Element(MetadataNode.Name.Namespace + LicenseNodeName)
multiple times and I don't want to care about it: I call it once and reuse the result as much as possible. If I make them members of LicenseCheckingNuspecReader
then I'd have to make that Element
call in each method or do something atrocious to keep the value between calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also don't want LicenseCheckingNuspecReader
to exist, since it's only purpose is to get access to one protected member of a base class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to do that and rolled back the change. It looks ugly.
It makes LicenseCheckingNuspecReader
to provide methods that check the content of the license element in various ways (a methods to check if it exists, a method to check its length, a method to check if there are child elements, etc.). Class methods cannot assume licenseElement is not null, so this check creeps into every method whereas in the original code its a single check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO you should split a lot of this code into separate files. Otherwise looks great.
@@ -106,6 +106,8 @@ public static class GalleryConstants | |||
|
|||
public const string GitRepository = "git"; | |||
|
|||
public const string LicenseDeprecationUrl = "https://aka.ms/deprecateLicenseUrl"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to remember to update it to something useful when the doc becomes available.
throw new InvalidOperationException($"Unsupported license metadata type: {licenseMetadata.Type}"); | ||
} | ||
|
||
private static bool HasChildElements(XElement xElement) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO LicenseCheckingNuspecReader
should do all XML-based operations itself and completely abstract away the fact the XML-shape of the nuspec.
=> licenseElement | ||
.GetOptionalAttributeValue("type"); | ||
|
||
private class LicenseCheckingNuspecReader : NuspecReader |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Move this to its own file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? It is only useful in the context of validation, I don't want to make it visible to all the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file is huge. Anything you can do to split it up into more logical parts would improve the readability of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The file is under 1000 lines. I've see methods longer than that :)
On a more serious note: moving it into separate file and making it public as a result would require to add a lot of code to be more defensive about how its methods are called. That will bloat its code which will not help the readability.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created work item to refactor the PackageUploadService
, we'll address these issues when we'd be working on it.
{ | ||
// if the package is immediately made available, it means there is a high chance we don't have | ||
// validation pipeline that would normally store the license file, so we'll do it ourselves here. | ||
await SavePackageLicenseFile(packageFile, licenseStream => _packageFileService.SaveLicenseFileAsync(package, licenseStream)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit of making the method static
? IMO the lambda
here just makes the code more confusing--you're never going to be making this function do anything else other than save the file, so you've created a method named SavePackageLicenseFile
that needs to be told how to save the package license file. You should just pass in the Package
and call SaveLicenseFileAsync
directly.
One other option: if the stream returned by packageArchiveReader.GetEntry
is still valid after packageArchiveReader
is disposed, you could rename SavePackageLicenseFile
to GetPackageLicenseFile
and then return the stream so you don't have to pass the Package
to the function.
Enabling accepting license metadata. License files are not yet allowed.
Addresses #6545, #6529, #6516, #6532
New column was added to GalleryDB to contain the type of the embedded license file (plain text or markdown). Storage is implemented for the case when no async validation pipeline is present.
I ended up diverging from the spec which prescribed storing a presence of an embedded file as a bool flag because this way for the local storage case we lose the document type information. So an enum was introduced that persists that information in DB.
Warning when package does not contain <license> but specifies <licenseUrl>:
Warning when package does not contain any license information:
License URL rejection message (when enabled):
License file rejection message:
Warning when license URL is not aligned with license expression: