This repository has been archived by the owner on Dec 3, 2020. It is now read-only.

Update fallback extraction CSS selector data #84

Closed
biancadanforth opened this issue Aug 30, 2018 · 0 comments

@biancadanforth
Collaborator

Our product_extraction_data.json is outdated. We should update this file so that, at a minimum, it has CSS selectors for all five of our top sites (Home Depot and Best Buy are currently missing) to fall back on when Fathom extraction fails.

@biancadanforth biancadanforth added this to the November MVP milestone Aug 30, 2018
@biancadanforth biancadanforth self-assigned this Sep 12, 2018
biancadanforth added a commit that referenced this issue Sep 14, 2018
Improve fallback extraction by CSS selectors:
* Update selectors for the top 5 sites. Add Home Depot and Best Buy.
* Rename selectors JSON file to be more descriptive (was 'product_extraction_data.json', now 'fallback_extraction_selectors.json').
* Represent supported sites in 'fallback_extraction_selectors.json' as regular expression strings so that fallback extraction works for any subdomain of the site (e.g. 'smile.amazon.com').
* Represent CSS selectors by tuples in 'fallback_extraction_selectors.json', so that each selector can specify which attribute or property to read for that selector.
* Clean price strings from fallback extraction using the same methods as used by Fathom extraction (PR #111); consolidate and move shared methods to 'utils.js'.
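
The regular-expression site keys and selector tuples described in the commit message above could look roughly like the sketch below. This is only an illustration under stated assumptions: the object layout, the `.example-*` selectors, and the helper names `getSelectorsForHostname` and `extractFeature` are hypothetical and are not taken from the actual 'fallback_extraction_selectors.json' or 'utils.js'.

```javascript
// Hypothetical stand-in for 'fallback_extraction_selectors.json'.
// Keys are regular expression strings matched against the hostname;
// values map each feature to a list of [selector, attributeOrProperty] tuples.
// The selectors here are placeholders, not the real site selectors.
const fallbackSelectors = {
  '^(?:[\\w-]+\\.)*amazon\\.com$': {
    title: [['.example-missing-title', 'innerText'], ['.example-title', 'innerText']],
    price: [['.example-price', 'content']],
  },
};

// Return the selector set whose regex key matches the hostname, or null.
function getSelectorsForHostname(hostname, selectorData = fallbackSelectors) {
  for (const [pattern, selectors] of Object.entries(selectorData)) {
    if (new RegExp(pattern).test(hostname)) {
      return selectors;
    }
  }
  return null;
}

// Try each tuple in order; read the named property first, then fall back
// to an attribute of the same name. Returns a trimmed string or null.
function extractFeature(doc, selectorTuples) {
  for (const [selector, name] of selectorTuples) {
    const element = doc.querySelector(selector);
    if (!element) {
      continue;
    }
    let value = element[name];
    if (value === undefined && typeof element.getAttribute === 'function') {
      value = element.getAttribute(name);
    }
    if (value) {
      return String(value).trim();
    }
  }
  return null;
}
```

In a content script, `doc` would be the page's `document` and the hostname would come from `window.location.hostname`. Anchoring the pattern with `^` and `$` lets 'smile.amazon.com' match while rejecting lookalike hosts such as 'notamazon.com'.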
biancadanforth added a commit that referenced this issue Sep 21, 2018
biancadanforth added a commit that referenced this issue Sep 21, 2018
biancadanforth added a commit that referenced this issue Sep 28, 2018
biancadanforth added a commit that referenced this issue Sep 28, 2018