Improve price string parsing when there is a single price on the page #79
Comments
Note: from the above URL, I click to a related item (a memory card) and click the dropdown. It's working fine.
I copied the console from the Fuji page. I'm guessing it's just the first bit on ExtractedProductCard.
I broke it again on this Gap.com page (Gap wasn't in our list of 5 top sites but I was trying different sites anyway...)
It also breaks on amazon.co.uk
The doorhanger is also broken on Macys.com (a supported site for MVP)
Thank you Javaun! The pages you reference actually point primarily at two related but separate issues:
Other pages you cited that don't fall into these categories:
I'm going to edit the title of this issue to be more descriptive of situation 1, now that the digging is done.
After I renamed the issue, I realized this is a duplicate of an issue Osmose filed much earlier: #42. Closing in favor of that one.
* Update how we pull the price string from extracted Fathom price elements to provide main and subunits (e.g. dollars and cents) if available.
* Add price string cleaning methods to remove extra characters (like commas) that were causing price parsing to fail.
* Handle case when price string parsing still fails after cleaning by checking in the background script that the price string is formatted correctly before rendering the browserAction popup.
* This will guarantee we never see the “blank panel” reported in #79 and #88.

Price element innerText strings now supported as a result of these changes:
* "$1327 /each" ([Home Depot example page](https://www.homedepot.com/p/KitchenAid-Classic-4-5-Qt-Tilt-Head-White-Stand-Mixer-K45SSWH/202546032))
* "$1,049.00" ([Amazon example page](https://www.amazon.com/Fujifilm-X-T2-Mirrorless-F2-8-4-0-Lens/dp/B01I3LNQ6M/ref=sr_1_2?ie=UTF8&qid=1535594119&sr=8-2&keywords=fuji+xt2+camera))
* "US $789.99" ([eBay example page](https://www.ebay.com/itm/Dell-Inspiron-7570-15-6-Touch-Laptop-i7-8550U-1-8GHz-8GB-1TB-NVIDIA-940MX-W10/263827294291))
* "$4.99+" ([Etsy example page](https://www.etsy.com/listing/555504975/frankenstein-2-custom-stencil?ga_order=most_relevant&ga_search_type=all&ga_view_type=gallery&ga_search_query=&ref=sr_gallery-1-13))

Note: This does not handle the case where there is more than one price for the product page (e.g. if we see a range of prices such as "$19.92 - $38.00" or if the price changes based on size/color, etc.); that’s handled by Issue #86.
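The cleaning step described in the change notes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the extension's actual code: the names `cleanPriceString` and `parsePrice`, the returned object shape, and the "strip everything except digits and the decimal point" rule are all hypothetical.

```javascript
// Hypothetical helpers; names and return shape are assumptions for illustration.

// Remove everything except digits and the decimal point, so that
// "$1,049.00", "US $789.99" and "$4.99+" reduce to a plain number string.
function cleanPriceString(rawText) {
  return rawText.replace(/[^0-9.]/g, '');
}

// Parse a price element's text into main units and subunits.
// Returns null instead of throwing when the cleaned string is not a
// single well-formed price (for example, a range of prices).
function parsePrice(rawText) {
  const cleaned = cleanPriceString(rawText);
  const match = /^(\d+)(?:\.(\d{1,2}))?$/.exec(cleaned);
  if (!match) {
    return null;
  }
  const mainUnits = Number.parseInt(match[1], 10);
  // Pad "5" to "50" so that "4.5" means 50 subunits, not 5.
  const subUnits = match[2] ? Number.parseInt(match[2].padEnd(2, '0'), 10) : 0;
  return { mainUnits, subUnits, amountInCents: mainUnits * 100 + subUnits };
}

console.log(parsePrice('$1,049.00'));       // { mainUnits: 1049, subUnits: 0, amountInCents: 104900 }
console.log(parsePrice('US $789.99'));      // { mainUnits: 789, subUnits: 99, amountInCents: 78999 }
console.log(parsePrice('$4.99+'));          // { mainUnits: 4, subUnits: 99, amountInCents: 499 }
console.log(parsePrice('$19.92 - $38.00')); // null
```

A string with no decimal separator, such as "$1327 /each", is treated by this sketch as whole main units; splitting main and subunits in that case would need more information than the flattened string carries.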
Edit (bdanforth): For a detailed explanation of the root causes of these failures, see this comment below.
TL;DR: These symptoms are primarily due to a brittle utility function that reformats the price strings extracted from a product page and does not handle the case where parsing fails. Symptoms covered by other issues include pages with more than one price (#86) and failed extraction (#95).
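To make the failure mode concrete, here is a rough sketch of a formatter that throws on an unparseable price string, plus a guard that turns that failure into an explicit fallback state rather than a blank panel. All names here (`priceStringToAmount`, `tryPriceStringToAmount`, `popupStateFor`, and the `view` values) are hypothetical and chosen only for illustration; the extension's real functions may look quite different.

```javascript
// Hypothetical sketch; none of these names come from the extension itself.

// A brittle formatter: it throws as soon as the string is not a plain number.
function priceStringToAmount(priceString) {
  const amount = Number.parseFloat(priceString);
  if (Number.isNaN(amount)) {
    throw new Error(`Unparseable price string: ${priceString}`);
  }
  return amount;
}

// A guard around the formatter: parsing failures become null.
function tryPriceStringToAmount(priceString) {
  try {
    return priceStringToAmount(priceString);
  } catch (err) {
    return null;
  }
}

// Decide what the popup should show before rendering anything.
function popupStateFor(extractedProduct) {
  const amount = extractedProduct
    ? tryPriceStringToAmount(extractedProduct.price)
    : null;
  return amount === null ? { view: 'empty' } : { view: 'product', amount };
}

console.log(popupStateFor({ price: '1049.00' }));   // { view: 'product', amount: 1049 }
console.log(popupStateFor({ price: '$1,049.00' })); // { view: 'empty' }
```

The point of the guard is that the decision about what to render happens before the popup is drawn, so an unparseable string leads to a deliberate empty state instead of an exception partway through rendering.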
See also #42.
Extension v0.1.0
Fuji XT2 Camera on Amazon:
STR:
URL
https://www.amazon.com/Fujifilm-X-T2-Mirrorless-F2-8-4-0-Lens/dp/B01I3LNQ6M/ref=sr_1_2?ie=UTF8&qid=1535594119&sr=8-2&keywords=fuji+xt2+camera
I'm guessing extraction failed?
On initial load, I correctly see the current items in the dropdown.
When the page is done loading, I close and reopen the panel, and it's now blank.