This repository has been archived by the owner on Dec 3, 2020. It is now read-only.

Break out fathom_ruleset.js into two separate scripts.
1. The first script, 'ruleset_factory.js', exports a class to create a ruleset based on a set of coefficients; instances of this class are used in production (via 'fathom_extraction.js') and for Fathom training (via 'trainees.js').
2. The second script, 'trainees.js', is used exclusively for training using the FathomFox web extension and does not ship with the commerce web extension.
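
A minimal sketch of how production code is expected to consume the factory, assuming the constructor takes the coefficients as a positional array. The stub class and the coefficient values below are hypothetical stand-ins (the real 'ruleset_factory.js' builds a Fathom ruleset); only the names 'RulesetFactory', 'makeRuleset' and 'defaultCoefficients' come from this commit.

```javascript
// Hypothetical stand-in for the class exported by 'ruleset_factory.js'.
// The real class builds a Fathom ruleset; this stub only records the
// positional coefficients so the calling pattern can be shown.
class RulesetFactory {
  constructor(coefficients) {
    [this.largerImageCoeff, this.largerFontSizeCoeff] = coefficients;
  }

  makeRuleset() {
    // The real method returns a Fathom ruleset; here it is a plain object.
    return {
      largerImageCoeff: this.largerImageCoeff,
      largerFontSizeCoeff: this.largerFontSizeCoeff,
    };
  }
}

// Illustrative values only, not the shipped defaults.
const defaultCoefficients = {largerImageCoeff: 2, largerFontSizeCoeff: 1};

// Object.values() keeps insertion order for these string keys, so the key
// order in the JSON file decides which coefficient lands in which position.
const coefficients = Object.values(defaultCoefficients);

// Built once and reused, since the production coefficients are static.
const rules = new RulesetFactory(coefficients).makeRuleset();
```

Because the array is positional, reordering the keys in 'fathom_default_coefficients.json' would silently reassign coefficients, so the key order there likely needs to match the order the factory expects.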

Additional changes and notes:
* I chose not to make use of the 'autobind' decorator in 'ruleset_factory.js', since it is also used in the training add-on, where devDeps like 'babel-core' and 'babel-plugin-transform-decorators-legacy' do not exist.
* I also turned off an eslint rule that requires class methods to use 'this', since some methods in RulesetFactory don't require it, and it would be tedious and confusing to call some methods on the class instance and others on the class itself.
* The new training script ('trainees.js') has three elements in the map it exports, one for each product feature ('image', 'title', 'price'). This allows us to select which feature to train from a dropdown menu on FathomFox's trainer page.
* Currently, for training, four files must be copied over into the 'fathom-trainees' add-on src directory:
  * config.js
  * fathom_default_coefficients.json
  * ruleset_factory.js
  * trainees.js (overwriting the existing file)
* In a separate commit, I will put all the Fathom extraction files into an 'extraction' (or similar) subfolder.
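
For illustration, the three-entry map described above could look roughly like the sketch below. The entry shape is an assumption: 'rulesetMaker' appears in the old 'fathom_extraction.js', while the 'coeffs' key, the stub class and the coefficient values are hypothetical and not taken from this commit.

```javascript
// Hypothetical stub; the real RulesetFactory lives in 'ruleset_factory.js'.
class RulesetFactory {
  constructor(coefficients) {
    this.coefficients = coefficients;
  }

  makeRuleset() {
    // The real method returns a Fathom ruleset.
    return {coefficients: this.coefficients};
  }
}

// Illustrative starting coefficients only.
const coeffs = [2, 1, 3];

// One entry per product feature, so each feature can be picked and
// trained separately from the trainer page's dropdown.
const trainees = new Map();
for (const feature of ['image', 'title', 'price']) {
  trainees.set(feature, {
    coeffs, // assumed key name for the initial coefficients
    rulesetMaker: coefficients => new RulesetFactory(coefficients).makeRuleset(),
  });
}
```

All three entries share the same factory, so production and training stay on one set of rules; the entries differ only in the feature being optimized.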
biancadanforth committed Aug 19, 2018
1 parent 8b306ef commit f3c543a
Showing 6 changed files with 424 additions and 377 deletions.
1 change: 1 addition & 0 deletions .eslintrc.json
@@ -13,6 +13,7 @@
     "prefer-destructuring": ["off"],
     "no-restricted-syntax": ["off"],
     "no-use-before-define": ["error", {"functions": false}],
+    "class-methods-use-this": ["off"],
 
     "react/jsx-one-expression-per-line": ["off"],
     "react/prefer-stateless-function": ["off"],
File renamed without changes.
37 changes: 9 additions & 28 deletions src/fathom_extraction.js
@@ -10,45 +10,26 @@
  * Features: title, image, price
  */
 
-import productRuleset from 'commerce/fathom_ruleset';
-import {
-  largerImageCoeff,
-  largerFontSizeCoeff,
-  hasDollarSignCoeff,
-  hasPriceInIDCoeff,
-  hasPriceInClassNameCoeff,
-  isAboveTheFoldPriceCoeff,
-  isAboveTheFoldImageCoeff,
-  isNearbyImageXAxisPriceCoeff,
-  isNearbyImageYAxisTitleCoeff,
-  hasPriceishPatternCoeff,
-} from 'commerce/fathom_coefficients.json';
+import defaultCoefficients from 'commerce/fathom_default_coefficients.json';
+import RulesetFactory from 'commerce/ruleset_factory';
 import {SCORE_THRESHOLD} from 'commerce/config';
 
 const PRODUCT_FEATURES = ['title', 'price', 'image'];
-const {rulesetMaker} = productRuleset.get('product');
-const rulesetWithCoeffs = rulesetMaker([
-  largerImageCoeff,
-  largerFontSizeCoeff,
-  hasDollarSignCoeff,
-  hasPriceInIDCoeff,
-  hasPriceInClassNameCoeff,
-  isAboveTheFoldPriceCoeff,
-  isAboveTheFoldImageCoeff,
-  isNearbyImageXAxisPriceCoeff,
-  isNearbyImageYAxisTitleCoeff,
-  hasPriceishPatternCoeff,
-]);
+// Array of numbers corresponding to the coefficients
+const coefficients = Object.values(defaultCoefficients);
+// For production, we don't need to generate a new ruleset factory
+// and ruleset every time we run Fathom, since the coefficients are static.
+const rulesetFactory = new RulesetFactory(coefficients);
+const rules = rulesetFactory.makeRuleset();
 
 /**
  * Extracts the highest scoring element above a score threshold
  * contained in a page's HTML document.
  */
 function runRuleset(doc) {
-  const rulesetOutput = rulesetWithCoeffs.against(doc);
   const extractedElements = {};
   for (const feature of PRODUCT_FEATURES) {
-    let fnodesList = rulesetOutput.get(feature);
+    let fnodesList = rules.against(doc).get(feature);
     fnodesList = fnodesList.filter(fnode => fnode.scoreFor(`${feature}ish`) >= SCORE_THRESHOLD);
     // It is possible for multiple elements to have the same highest score.
     if (fnodesList.length >= 1) {