@ta11y/extract

Extracts content from websites for running accessibility audits with ta11y.


Install

npm install --save @ta11y/extract

Usage

The easiest way to use this package is through the CLI. It can also be called directly from Node.js:

const { extract } = require('@ta11y/extract')

extract('https://en.wikipedia.org')
  .then((result) => {
    console.log(result.summary) // overview of results (number of urls visited, success, error)
    console.log(result.results) // detailed results keyed by url
  })

const { extract } = require('@ta11y/extract')

// example passing HTML directly
extract('<!doctype html><html><body><h1>I ❤ accessibility</h1></body></html>')
  .then((result) => {
    console.log(result.summary) // overview of results (number of urls visited, success, error)
    console.log(result.results) // detailed results keyed by url

    // note that the result key for an HTML input is 'root' instead of url
  })
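
The comments above say that `result.results` is keyed by URL, or by 'root' when raw HTML is passed in. As a minimal sketch of reading those keys, the snippet below uses a hypothetical helper, `listResultKeys`, which is not part of this package. The two sample objects are stand-ins shaped like the documented output, not real extraction results, and the per-page values are left empty because their structure is not described here.

```javascript
// Hypothetical helper (not part of @ta11y/extract): returns the keys of
// result.results, which the README describes as "keyed by url".
function listResultKeys (result) {
  return Object.keys(result.results || {})
}

// Stand-in objects shaped like the documented output.
// The empty per-page values are placeholders, not the real result structure.
const fromUrl = { summary: {}, results: { 'https://en.wikipedia.org': {} } }
const fromHtml = { summary: {}, results: { root: {} } }

console.log(listResultKeys(fromUrl)) // [ 'https://en.wikipedia.org' ]
console.log(listResultKeys(fromHtml)) // [ 'root' ]
```

In real use the same helper would be called inside the `.then` callback, so code that handles both URL and HTML inputs does not need to special-case the 'root' key.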

API

extract

Extracts the dynamic HTML content from a website, optionally crawling the site to discover additional pages and extracting those too.

Type: function (urlOrHtml, opts): Promise

  • urlOrHtml string URL or raw HTML to process.
  • opts object Config options.
    • opts.browser object Required Puppeteer browser instance to use.
    • opts.crawl boolean Whether or not to crawl additional pages. (optional, default false)
    • opts.maxDepth number Maximum crawl depth while crawling. (optional, default 16)
    • opts.maxVisit number? Maximum number of pages to visit while crawling.
    • opts.sameOrigin boolean Whether or not to only consider crawling links with the same origin as the root URL. (optional, default true)
    • opts.blacklist Array<string>? Optional blacklist of URL glob patterns to ignore.
    • opts.whitelist Array<string>? Optional whitelist of URL glob patterns to only include.
    • opts.gotoOptions object? Customize the Page.goto navigation options.
    • opts.viewport object? Set the browser window's viewport dimensions and/or resolution.
    • opts.userAgent string? Set the browser's user-agent.
    • opts.emulateDevice string? Emulate a specific device type.
      • Use the name property from one of the built-in devices.
      • Overrides viewport and userAgent.
    • opts.onNewPage function? Optional async function called every time a new page is initialized before proceeding with extraction.
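
The options above can be combined into a single `opts` object. The following is a minimal sketch, assuming `puppeteer` is installed alongside this package. `buildExtractOptions` and `crawlSite` are hypothetical names introduced for illustration, and the default values are copied from the option list rather than read from the library.

```javascript
// Defaults as documented in the option list above (copied by hand, not
// read from the library).
const DOCUMENTED_DEFAULTS = {
  crawl: false,
  maxDepth: 16,
  sameOrigin: true
}

// Hypothetical helper (not part of @ta11y/extract): layers caller overrides
// over the documented defaults and insists on a browser instance.
function buildExtractOptions (browser, overrides = {}) {
  if (!browser) {
    throw new Error('opts.browser is documented as required')
  }
  return { ...DOCUMENTED_DEFAULTS, ...overrides, browser }
}

// Sketch of a small crawl. Assumes `puppeteer` and `@ta11y/extract` are
// installed; they are loaded lazily so the helper above works without them.
async function crawlSite (url) {
  const puppeteer = require('puppeteer')
  const { extract } = require('@ta11y/extract')

  const browser = await puppeteer.launch()
  try {
    const opts = buildExtractOptions(browser, {
      crawl: true,
      maxDepth: 2,
      maxVisit: 10
    })
    return await extract(url, opts)
  } finally {
    // Always release the browser, even if extraction fails.
    await browser.close()
  }
}

// Example call (not run here):
// crawlSite('https://en.wikipedia.org').then((result) => console.log(result.summary))
```

Closing the browser in a `finally` block is a design choice of this sketch: the caller creates the Puppeteer instance, so the caller is also the one to shut it down.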

License

MIT © Saasify