Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleans test manifest from absent files. #6311

Merged
merged 1 commit into from
Jan 14, 2016

Conversation

yurydelendik
Copy link
Contributor

Adds "disabled" property for the test with missing linked files. To enable these tests (e.g. at botio) --allowDisabled parameters shall be used with test.js.

@@ -928,6 +929,7 @@
"md5": "2ade57e954ae7632749cf328daeaa7a8",
"rounds": 1,
"link": true,
"disabled": true,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rob--W
Copy link
Member

Rob--W commented Aug 8, 2015

What's the purpose of this patch?

@@ -2267,6 +2277,7 @@
"link": true,
"firstPage": 9,
"lastPage": 9,
"disabled": true,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If PR #6434 is accepted, then this line can be removed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#6434 has been merged, so this line can be removed.

@yurydelendik yurydelendik force-pushed the clean_test_manifest branch 4 times, most recently from 647fa21 to 0ec0de6 Compare October 30, 2015 18:34
@yurydelendik
Copy link
Contributor Author

What's the purpose of this patch?

@Rob--W , with current state of the tree the contributors efficiently cannot run tests (or tell if their change breaks stuff). This patch will clean up the manifest from links of disappeared from the internet files.

@Snuffleupagus and @brendandahl , are you okay with merging it?

@@ -1807,7 +1813,7 @@
"link": false,
"type": "load",
"password": "\u00E6\u00F8\u00E5",
"about": "The password (���) is UTF8 encoded."
"about": "The password (æøå) is UTF8 encoded."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly changed in this line? I can't see anything obvious in the diff. It is some encoding thing?

@timvandermeij
Copy link
Contributor

I see more hash changes than there are new URLs. Where do those come from exactly? Although this patch is useful, it would of course be better to replace all disabled items with either linked (from the Web Archive) or reduced test cases. @Snuffleupagus already did this for quite some tests, which was awesome, so that has my preference, but I also understand that being able to run all tests without missing PDF problems is important, so I'm fine with either way.

@Snuffleupagus
Copy link
Collaborator

I see more hash changes than there are new URLs.

I'm wondering/worrying about all of these hash changes too!
@yurydelendik Did you in every case manually confirm that the new version of a PDF file demonstrates the exact same issue as the old one, since otherwise the test might not actually test what you'd expect?

Edit: So far, I've only checked the first case of changed hashes, and there the new file version wasn't a good replacement for the old file. So it seems to me that all of the changed hashes need careful verification on a case-by-case basis.

@@ -68,7 +68,7 @@
},
{ "id": "issue3879",
"file": "pdfs/issue3879.pdf",
"md5": "1cd1f1c3271515a14e65ff2980e14663",
"md5": "85393be14a304d55f516918a4df889e3",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new version of this file does not contain the same deficiency as the old one!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With PR #6585, this change shouldn't be necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That PR has been merged, so this change can be removed.

@Rob--W
Copy link
Member

Rob--W commented Nov 1, 2015

Instead of linking to external files and hoping that they stick around, what do you think of creating a (git?) repository to host these test cases?

I have most of the PDFs locally (the oldest batch was retrieved on the 30th of May (2013)) in case you need it.

@timvandermeij
Copy link
Contributor

@Snuffleupagus That's unfortunate, but I was also a bit afraid of that. Sometimes it takes a lot of time and effort to verify if a new file has the same issue, so perhaps we should do this in separate PRs (just like you always do) so we can test them on a case-by-case basis.

@Rob--W I believe there was a problem with redistribution. I'm all for such a Git repository, but I think that we are only allowed to link to the original PDF files if we have no permission to redistribute them. Hence I tihnk that we should replace those linked test cases with reduced test cases that we can put in our repository so we do not have that problem. Anyone who knows how to create reduced test cases (or perhaps unit tests that demonstrate the original problem of a linked PDF file) is very welcome to contribute those. I'm very happy that @Snuffleupagus is already doing such a great job with that. Maybe we should first focus on the disabled ones in this PR and try to replace them all (maybe even with a linked test case as anything is better than an unavailable test case).

@@ -2470,7 +2479,7 @@
},
{ "id": "issue4926",
"file": "pdfs/issue4926.pdf",
"md5": "ed881c8ea2f9bc4be94ecb7f2b2c149b",
"md5": "e0c36aaee776eea59a3b0cdbe27bd27d",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though this version of the PDF file seems to exhibit the same issue as the previous one, I'd still suggest updating the link to an older version instead:
http://web.archive.org/web/20140802003728/http://www.conservation.ca.gov/cgs/information/publications/cgs_notes/note_17/Documents/note_17.pdf.

My reasoning is that the old version is half the size of the new one, and it seems unnecessary to require contributors to download bigger files than strictly required.

@yurydelendik
Copy link
Contributor Author

what do you think of creating a (git?) repository to host these test cases

@Rob--W As it was said above, we cannot redistribute third-party PDFs unless we will have permissions to do so. We actively request minimal test case PDFs and using linked PDFs as a last resort.

I'm thinking we are at the point when we can just discard all linked PDFs testing, however there is a tiny chance that we might regress something at fonts. I'm using this PR to:

  • help beginners to setup efficient testing environment
  • use it as a guide to determine which linked PDF must be replaced sooner
  • proposing intermediate solution, when we still testing linked PDFs on the servers, but contributors using smaller subset (later we can discard all linked PDF if we determine to do so)

@Snuffleupagus
Copy link
Collaborator

I'm thinking we are at the point when we can just discard all linked PDFs testing, however there is a tiny chance that we might regress something at fonts.

Unfortunately I'd be quite surprised if we could actually do this, without also losing test-coverage for a significant part of the code-base.

Based on my experience creating reduced test-cases, the real issue with removing linked tests probably isn't going to be with fonts, but rather with other parts of the code-base, e.g. the image parsing.
For fonts, you can always replace a linked test-case with a reduced one (even if it can take a fair amount of time/effort to do so). However, for test-cases related to e.g. image parsing (or certain types of PDF file corruption), it can be either very difficult or next to impossible to create reduced test-cases manually.

_Edit:_ I will, time permitting, continue to try and replace linked test-cases with reduced ones. But obviously I cannot promise neither a time-frame for this work, nor that I'll be able to successfully do so in all cases.

@@ -164,11 +164,12 @@
"link": true,
"lastPage": 2,
"rounds": 1,
"disabled": true,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With PR #6610, this change should be reverted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That PR has been merged, so this change can be removed.

@yurydelendik yurydelendik force-pushed the clean_test_manifest branch 2 times, most recently from b6d293a to d75a3f8 Compare November 16, 2015 18:18
@@ -1728,6 +1729,7 @@
"md5": "89ddf9e63cac4fa2dedcfe32a62e5407",
"rounds": 1,
"link": true,
"disabled": true,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change can be reverted, thanks to PR #6647.

@Snuffleupagus
Copy link
Collaborator

r=me, with the remaining comments resolved and the test command updated on the bots.

@timvandermeij
Copy link
Contributor

Is there really nothing more we can do here except for disabling some test cases? I was hoping we could just replace the dead test cases with reduced ones or new links so we would not need to disable them altogether. Perhaps we can find out when some broken links were added and contact the owner of the PDF files on the original issues to see if we can get a new link/permission to add it to the test suite?

@@ -2431,6 +2433,7 @@
"link": true,
"firstPage": 1,
"lastPage": 2,
"disabled": true,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change can be reverted once PR #6853 lands.

@@ -1941,7 +1941,7 @@
"link": false,
"type": "load",
"password": "\u00E6\u00F8\u00E5",
"about": "The password (���) is UTF8 encoded."
"about": "The password (æøå) is UTF8 encoded."
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now it's proper UTF-8 codes, not sure what encoding was used before.

@timvandermeij
Copy link
Contributor

/botio test

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.22.172.223:8877/77b08f895b66869/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/95b5d5347894019/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Success

Full output at http://107.22.172.223:8877/77b08f895b66869/output.txt

Total script time: 20.77 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Linux)


Failed

Full output at http://107.21.233.14:8877/95b5d5347894019/output.txt

Total script time: 60.00 mins

@timvandermeij
Copy link
Contributor

/botio-linux test

@pdfjsbot
Copy link

From: Bot.io (Linux)


Received

Command cmd_test from @timvandermeij received. Current queue size: 0

Live output at: http://107.21.233.14:8877/df5e5164afaf7ee/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux)


Success

Full output at http://107.21.233.14:8877/df5e5164afaf7ee/output.txt

Total script time: 21.49 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

timvandermeij added a commit that referenced this pull request Jan 14, 2016
Cleans test manifest from absent files.
@timvandermeij timvandermeij merged commit 8318c82 into mozilla:master Jan 14, 2016
@timvandermeij
Copy link
Contributor

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants