Partway through implementing #18, I noticed that it takes more than simply HTML to reconstitute the geometry of a page. That is, a mere headless browser does not suffice to deliver the additional signal we wanted. Let's choose a solution.
See if there are any off-the-shelf libs out there that "freeze-dry" a page, preserving node geometry without saving every resource in its entirety. [Nope, Swathi couldn't find any.]
We're going to break product work into its own repo so we can bloat with abandon. We'll capture Safari webarchives of corpus pages. We'll extract what we care about from them into a mock datastore off to the side so we can run our training and tests in node.
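To make the "mock datastore off to the side" idea concrete, here is a minimal sketch of what such a store might look like when consumed from node. Everything in it is an assumption for illustration: the `NodeGeometry` shape, the `MockGeometryStore` class, and the use of a selector string as the node key are hypothetical, not Fathom's actual API, and the step that extracts geometry from a webarchive is not shown.

```typescript
// Hypothetical record for one captured node. The field names are
// assumptions for this sketch; they do not come from Fathom.
interface NodeGeometry {
  selector: string; // some stable identifier for the node within its page
  x: number;
  y: number;
  width: number;
  height: number;
}

// In-memory stand-in for the mock datastore: page URL -> selector -> geometry.
// A real version would presumably be filled from captured corpus pages.
class MockGeometryStore {
  private pages = new Map<string, Map<string, NodeGeometry>>();

  // Record (or overwrite) the geometry of one node on one page.
  add(pageUrl: string, node: NodeGeometry): void {
    let nodes = this.pages.get(pageUrl);
    if (nodes === undefined) {
      nodes = new Map<string, NodeGeometry>();
      this.pages.set(pageUrl, nodes);
    }
    nodes.set(node.selector, node);
  }

  // Look up a node's geometry; undefined if the page or node was never captured.
  get(pageUrl: string, selector: string): NodeGeometry | undefined {
    const nodes = this.pages.get(pageUrl);
    return nodes === undefined ? undefined : nodes.get(selector);
  }

  // Example of a derived signal a ranker might want: the node's rendered area.
  area(pageUrl: string, selector: string): number | undefined {
    const node = this.get(pageUrl, selector);
    return node === undefined ? undefined : node.width * node.height;
  }
}
```

Training and test code could then ask the store for geometry instead of needing a live rendering engine, which is the point of keeping the captured data off to the side.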
See https://github.com/mozilla/fathom/wiki/PageGeometryCaptureSolutions for a more editable, history-tracking place to choose a solution.