This repository has been archived by the owner on Dec 3, 2020. It is now read-only.

Restrict product extraction to allowlisted domains #109

Closed
Osmose opened this issue Sep 11, 2018 · 2 comments

Comments

@Osmose
Contributor

Osmose commented Sep 11, 2018

@javaun @elancaster Given that we're now looking to ship via Test Pilot, do we still want to restrict the domains we attempt product extraction on? If not, we risk more incorrect extraction, but we give users the opportunity to report failure on domains that aren't in our original 5. We could also temporarily restrict for launch and open up after we have a better idea about how to handle ongoing development of our Fathom ruleset.

@javaun

javaun commented Sep 17, 2018

When we start collecting opt-in data, one compelling reason to have it just run without a whitelist is that we'd collect data differently for shopping pages vs. other sites:

  1. If it's not a recognized shopping product page, just send the domain name. This is one way we'll discover which long-tail sites users shop on.
  2. If Fathom does recognize a product page (whether it's an officially supported one, or Etsy, which isn't supported but sometimes works), send the full URL.
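
The two cases above could be sketched roughly as follows. This is only an illustration of the rule, not the add-on's actual telemetry code: `PagePing`, `buildPing`, and the `isProductPage` flag (standing in for whatever Fathom's ruleset reports) are hypothetical names.

```typescript
// Hypothetical payload shape: either just the domain, or the full URL.
type PagePing =
  | { kind: "domain"; domain: string }
  | { kind: "product"; url: string };

// isProductPage is assumed to come from the Fathom ruleset's verdict.
function buildPing(pageUrl: string, isProductPage: boolean): PagePing {
  const parsed = new URL(pageUrl);
  if (isProductPage) {
    // Case 2: recognized product page, send the full URL.
    return { kind: "product", url: parsed.href };
  }
  // Case 1: not a recognized product page, send only the domain name.
  return { kind: "domain", domain: parsed.hostname };
}
```

Keeping the two cases as separate payload variants makes it hard to send a full URL for a non-shopping page by accident.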

That said, until we are ready to start collecting data, I think we could still restrict Fathom extraction to whitelisted domains. Fathom is new tech and there could easily be perf issues. If so, isolating it to just 5 sites may make it easier to troubleshoot. What do you think?

@Osmose
Contributor Author

Osmose commented Sep 17, 2018

Sounds like "initially restrict, and revisit post-launch" is the way to go.
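
For reference, a minimal sketch of what "initially restrict" might look like, assuming the check is done on the page's hostname before extraction runs. The domains below are placeholders (the thread does not list the original five), and `ALLOWED_DOMAINS` and `isExtractionAllowed` are hypothetical names, not taken from the commit that closed this issue.

```typescript
// Placeholder allowlist; the real list of five supported sites is not given here.
const ALLOWED_DOMAINS: readonly string[] = ["shop.example", "store.example"];

// Returns true only when the page's hostname is an allowlisted domain
// or a subdomain of one (e.g. www.shop.example).
function isExtractionAllowed(
  pageUrl: string,
  allowlist: readonly string[] = ALLOWED_DOMAINS,
): boolean {
  let hostname: string;
  try {
    hostname = new URL(pageUrl).hostname.toLowerCase();
  } catch (err) {
    // Unparseable URLs never trigger extraction.
    return false;
  }
  return allowlist.some(
    (domain) => hostname === domain || hostname.endsWith("." + domain),
  );
}
```

Matching on `"." + domain` rather than a bare suffix avoids treating a lookalike host such as `notshop.example` as allowlisted. Opening extraction up post-launch would then amount to relaxing or removing this one check.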

@Osmose Osmose self-assigned this Sep 26, 2018
@Osmose Osmose closed this as completed in 6cbb552 Oct 2, 2018
Osmose added a commit that referenced this issue Oct 2, 2018
Fix #109: Disable extraction outside of allowlisted sites.