Decide how to support page geometry etc. in tests and tuners #69

Closed
1 task done
erikrose opened this issue Jul 18, 2017 · 1 comment

erikrose commented Jul 18, 2017

Partway through implementing #18, I noticed that it takes more than simply HTML to reconstitute the geometry of a page. That is, a mere headless browser does not suffice to deliver the additional signal we wanted. Let's choose a solution.

See https://github.com/mozilla/fathom/wiki/PageGeometryCaptureSolutions for a more editable, history-tracking place to do it.

  • See if there are any off-the-shelf libs out there that "freeze-dry" a page, preserving node geometry without saving every resource in its entirety. [Nope, Swathi couldn't find any.]

erikrose commented

We're going to break product work into its own repo so we can bloat with abandon. We'll capture Safari webarchives of corpus pages. We'll extract what we care about from them into a mock datastore off to the side so we can run our training and tests in node.
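
To make the "mock datastore off to the side" idea concrete, here is a minimal sketch of what such a store could look like. Everything in it is an assumption for illustration: the `CapturedRect` shape, the `GeometryStore` class, and the use of a CSS-like path string as the node key are all hypothetical, not anything decided in this issue or present in Fathom. The idea is only that geometry recorded once in a real browser can be looked up later in node, where no layout engine is available.

```typescript
// Hypothetical record of one node's box, as measured in a real browser
// at capture time (for example via getBoundingClientRect()).
interface CapturedRect {
  left: number;
  top: number;
  width: number;
  height: number;
}

// Hypothetical mock datastore: maps a node key (here a CSS-like path string,
// which is an assumption) to the geometry recorded for that node.
class GeometryStore {
  private rects = new Map<string, CapturedRect>();

  // Called by the capture step, once per node we care about.
  record(nodeKey: string, rect: CapturedRect): void {
    this.rects.set(nodeKey, rect);
  }

  // Stand-in for a live layout query when running under node.
  rectFor(nodeKey: string): CapturedRect {
    const rect = this.rects.get(nodeKey);
    if (rect === undefined) {
      throw new Error(`No captured geometry for ${nodeKey}`);
    }
    return rect;
  }

  // Example of a derived signal a rule might want.
  area(nodeKey: string): number {
    const { width, height } = this.rectFor(nodeKey);
    return width * height;
  }
}

// Usage: populate from a capture, then query during training or tests.
const store = new GeometryStore();
store.record("html > body > h1", { left: 8, top: 21, width: 784, height: 37 });
const headingArea = store.area("html > body > h1");
```

Throwing on a missing key, rather than returning a zero-sized box, keeps an incomplete capture from silently skewing training.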
