[NuGet.org Bug]: Non-increasing catalog leaf timestamps #10261

Open
joelverhagen opened this issue Nov 11, 2024 · 0 comments

Impact

It bothers me. A fix would be nice.

Describe the bug

The following pages have commit timestamps that are either equal (duplicate) or decreasing from one page to the next. Catalog leaf commit timestamps are expected to be monotonically increasing, and this is not the case for some very old leaves.

NiCatalogLeafItems
| summarize min(CommitTimestamp), max(CommitTimestamp) by PageUrl
| order by min_CommitTimestamp asc
| parse PageUrl with "https://api.nuget.org/v3/catalog0/page" PageNumber : int ".json"
| extend UntilNext = next(min_CommitTimestamp) - max_CommitTimestamp
| where isnotempty(UntilNext) and UntilNext <= 0s

| PageUrl | min_CommitTimestamp | max_CommitTimestamp | PageNumber | UntilNext |
| --- | --- | --- | --- | --- |
| https://api.nuget.org/v3/catalog0/page1300.json | 2016-01-13 18:32:59.2796915 | 2016-01-13 22:11:49.1579762 | 1300 | -00:00:02.5247195 |
| https://api.nuget.org/v3/catalog0/page1309.json | 2016-01-15 01:37:48.4657215 | 2016-01-15 04:02:56.0470835 | 1309 | 00:00:00 |
| https://api.nuget.org/v3/catalog0/page1436.json | 2016-03-11 20:12:08.9916942 | 2016-03-12 07:07:06.5567940 | 1436 | 00:00:00 |
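
For anyone without access to NuGet Insights, the same adjacent-page comparison could be sketched in Python. This is only a rough sketch and not the tooling used for this issue: the function name `find_non_increasing`, the tuple layout, and the sample timestamps are all hypothetical.

```python
from datetime import datetime, timedelta


def find_non_increasing(pages):
    """Flag pages whose successor starts at or before this page's max timestamp.

    `pages` is an iterable of (page_number, min_ts, max_ts) tuples
    (a hypothetical layout chosen for this sketch).
    Returns a list of (page_number, until_next) for each flagged page.
    """
    # Mirror `order by min_CommitTimestamp asc` from the query.
    ordered = sorted(pages, key=lambda p: p[1])
    flagged = []
    # Pair each page with the one that follows it, like `next(...)` in the query.
    for (number, _, max_ts), (_, next_min, _) in zip(ordered, ordered[1:]):
        until_next = next_min - max_ts
        if until_next <= timedelta(0):
            flagged.append((number, until_next))
    return flagged


# Illustrative values only; these are not real catalog timestamps.
sample = [
    (1, datetime(2016, 1, 1, 0, 0, 0), datetime(2016, 1, 1, 1, 0, 0)),
    (2, datetime(2016, 1, 1, 1, 0, 0), datetime(2016, 1, 1, 2, 0, 0)),    # starts exactly at page 1's max
    (3, datetime(2016, 1, 1, 1, 59, 58), datetime(2016, 1, 1, 3, 0, 0)),  # starts 2 s before page 2's max
    (4, datetime(2016, 1, 1, 3, 0, 1), datetime(2016, 1, 1, 4, 0, 0)),
]

for number, until_next in find_non_increasing(sample):
    print(number, until_next.total_seconds())
```

Note that Python's `datetime` only keeps microseconds, so the 7-digit fractional seconds shown in the results would need to be truncated or handled with a different type before using something like this on real data.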

For the negative delta case, a leaf appears to have been written at the end of one page and again at the beginning of the next, with a different leaf URL each time.
image

For the first 0 case, a batch of 2 items was written at the end of one page and again at the beginning of the next page (with an additional item upon retry).
image

The second 0 case is similar:
image

This also means some leaf URLs appear in multiple pages, which is not expected:

NiCatalogLeafItems
| summarize count() by Url
| where count_ > 1
| distinct Url
| join kind=inner NiCatalogLeafItems on Url
| project-away Url1
| parse PageUrl with "https://api.nuget.org/v3/catalog0/page" PageNumber : int ".json"
| order by Url asc, PageNumber asc
| project Url, PageNumber

| Url | PageNumber |
| --- | --- |
| https://api.nuget.org/v3/catalog0/data/2016.01.15.04.02.56/babylonjs.typescript.definitelytyped.1.2.1.json | 1309 |
| https://api.nuget.org/v3/catalog0/data/2016.01.15.04.02.56/babylonjs.typescript.definitelytyped.1.2.1.json | 1310 |
| https://api.nuget.org/v3/catalog0/data/2016.01.15.04.02.56/backbone-relational.typescript.definitelytyped.1.0.7.json | 1309 |
| https://api.nuget.org/v3/catalog0/data/2016.01.15.04.02.56/backbone-relational.typescript.definitelytyped.1.0.7.json | 1310 |
| https://api.nuget.org/v3/catalog0/data/2016.03.12.07.07.06/awssdk.dynamodbv2.3.2.3-beta.json | 1436 |
| https://api.nuget.org/v3/catalog0/data/2016.03.12.07.07.06/awssdk.dynamodbv2.3.2.3-beta.json | 1437 |
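
A rough Python equivalent of the duplicate-URL check, in case it helps someone verify this outside of Kusto. The function name `urls_in_multiple_pages` and the (url, page_number) input shape are assumptions made for this sketch, and the `example.package` URL is made up. One difference from the query: the query counts rows per URL, while this sketch only reports URLs seen in more than one distinct page.

```python
from collections import defaultdict


def urls_in_multiple_pages(items):
    """Return {url: sorted page numbers} for leaf URLs seen in more than one page.

    `items` is an iterable of (url, page_number) pairs
    (a hypothetical shape chosen for this sketch).
    """
    pages_by_url = defaultdict(set)
    for url, page_number in items:
        pages_by_url[url].add(page_number)
    return {
        url: sorted(pages)
        for url, pages in pages_by_url.items()
        if len(pages) > 1
    }


# The first URL is taken from the results above; the second is a made-up control.
duplicated = "https://api.nuget.org/v3/catalog0/data/2016.03.12.07.07.06/awssdk.dynamodbv2.3.2.3-beta.json"
unique = "https://api.nuget.org/v3/catalog0/data/2016.03.12.07.07.06/example.package.1.0.0.json"

rows = [(duplicated, 1436), (duplicated, 1437), (unique, 1437)]

for url, pages in urls_in_multiple_pages(rows).items():
    print(url, pages)
```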

Repro Steps

Run the queries above against NuGet Insights.

Expected Behavior

Commit timestamps should be monotonically increasing, with no duplicates or decreases from one page to the next.

A catalog leaf URL should appear in only a single page.

The commit ID and commit timestamp on the page leaf item should match the full leaf document. This is not the case for the duplicate leaf URLs.
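
The last expectation could be checked per leaf with something like the sketch below. It assumes the field names from the catalog resource documentation as I understand them (`commitId` and `commitTimeStamp` on the page item, `catalog:commitId` and `catalog:commitTimeStamp` on the leaf document), and it compares raw strings, so differently formatted but equal timestamps would be reported as mismatches. The helper name `mismatched_fields` and all sample values are hypothetical.

```python
# Assumed field names: (key on the page's leaf item, key on the full leaf document).
FIELD_PAIRS = [
    ("commitId", "catalog:commitId"),
    ("commitTimeStamp", "catalog:commitTimeStamp"),
]


def mismatched_fields(page_item, leaf_document):
    """Return the page item keys whose values differ from the leaf document."""
    return [
        item_key
        for item_key, leaf_key in FIELD_PAIRS
        if page_item[item_key] != leaf_document[leaf_key]
    ]


# Made-up values for illustration; not taken from the real catalog.
page_item = {
    "commitId": "00000000-0000-0000-0000-000000000001",
    "commitTimeStamp": "2016-01-01T00:00:00Z",
}
leaf_document = {
    "catalog:commitId": "00000000-0000-0000-0000-000000000002",
    "catalog:commitTimeStamp": "2016-01-01T00:00:00Z",
}

print(mismatched_fields(page_item, leaf_document))
```

Since both inputs are plain dictionaries, the page and leaf JSON can be fetched and parsed however is convenient before being passed in.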

Screenshots

No response

Additional Context and logs

No response
