Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

0.37.0 - 2024-12-06

Added

Fixed

  • Fix bug propagating identity encoder in raw_html/2 - thanks @andyleclair.

Removed

  • Remove support for Elixir 1.13 and OTP 22.

0.36.3 - 2024-10-21

This release contains some performance improvements, thanks to @ypconstante.

Fixed

  • Stop Floki.get_by_id/2 traversal on first match. Thanks @ypconstante.

  • Remove extra whitespace from nodes without attributes in Floki.raw_html/1. Thank you @ypconstante.

  • Fix Floki.raw_html/1 typespecs. Thanks @davydog187.

0.36.2 - 2024-04-26

Added

  • Implement the Inspect protocol for the Floki.HTMLTree struct. This struct is currently private. Thank you @vittoriabitton.

Fixed

  • Fix regression to respect config option :encode in Floki.raw_html/2. Thanks @Sgoettschkes.

  • Make Floki.raw_html/2 treat the contents of the <title> tag as plain text. The idea is to align with parse_document/2. Thank you @aymanosman.

0.36.1 - 2024-03-18

Fixed

  • Fix typespec of get_by_id/2.

0.36.0 - 2024-03-01

Added

  • Add Floki.get_by_id/2 that returns one element by ID or nil. Thanks @SteffenDE.
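A minimal sketch of how Floki.get_by_id might be called, assuming the default parser and a script run with `elixir`. The markup and IDs are hypothetical and only for illustration.

```elixir
# Pull Floki from Hex for a standalone script (assumption: network access).
Mix.install([{:floki, "~> 0.36"}])

# Hypothetical markup, for illustration only.
{:ok, doc} = Floki.parse_fragment(~s(<div><p id="intro">Hi</p><p id="outro">Bye</p></div>))

# Returns the matching element as a {tag, attributes, children} tuple...
found = Floki.get_by_id(doc, "intro")

# ...or nil when no element has that ID.
missing = Floki.get_by_id(doc, "nope")

IO.inspect(found)
IO.inspect(missing)
```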

Changed

  • Improve options validation with Keyword.validate!/2. This is not a change in APIs, but the error messages and opts validation should be standardized now. Thanks @vittoriabitton.

Removed

  • Drop support for Elixir v1.12.

0.35.4 - 2024-02-19

Besides the fix described below, this release also contains more performance improvements, thanks to @ypconstante.

Fixed

  • Fix order of results for Floki.find/2. This was a regression from the previous version - thanks @ypconstante.

0.35.3 - 2024-01-25

This release has great performance improvements, thanks to the PRs from @ypconstante!

Most of the main functions, such as Floki.raw_html/2 and Floki.find/2, are faster and use less memory. For example, find/2 is roughly twice as fast and uses about half the memory.

Fixed

  • Add :leex to Mix compilers. Fixes the build when running with dev version of Elixir. Thanks @wojtekmach.

  • Fix Floki.raw_html/2 when a tree using attributes as maps is given. Thanks @SupaMic.

  • Add a guard to Floki.find/2 so people can have a better error message when an invalid input is given. Thanks @Hajto.

  • Fix parsers to consider IO data as inputs. This may change in the next version of Floki, as I plan to drop support for IO data. Thanks @ypconstante.

Removed

  • Remove outdated Gleam wrapper code. The external functions syntax in Gleam has changed. So now the wrapper is not needed anymore. Thanks @michallepicki.

0.35.2 - 2023-10-25

Fixed

  • Enable usage of IO data by removing a guard for binaries in the main parser module.

0.35.1 - 2023-10-16

Fixed

  • Fix a small regression in the mochiweb parser that caused it to break when malformed HTML was given. For more details, see the original issue: #492

0.35.0 - 2023-10-13

Added

  • Add support for parsing attributes as maps.

    This makes parse_document/2 and parse_fragment/2 accept the option :attributes_as_maps to change the behaviour and return attributes as maps instead of lists of tuples. The only parser that does not support it yet is fast_html.
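As a rough illustration of the :attributes_as_maps option, the sketch below parses the same hypothetical fragment twice with the default parser and inspects both shapes. The markup is made up for the example.

```elixir
# Pull Floki from Hex for a standalone script (assumption: network access).
Mix.install([{:floki, "~> 0.36"}])

# Hypothetical markup, for illustration only.
html = ~s(<p class="intro" id="first">Hello</p>)

# Default: attributes come back as a list of {name, value} tuples.
{:ok, as_list} = Floki.parse_fragment(html)

# With the option: attributes come back as a map keyed by attribute name.
{:ok, as_map} = Floki.parse_fragment(html, attributes_as_maps: true)

IO.inspect(as_list)
IO.inspect(as_map)
```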

Changed

  • Drop support for Elixir v1.11.

  • Change the log level of parsing logger calls from "info" to "debug". This will help to reduce the amount of noise in production apps.

0.34.3 - 2023-06-02

Added

  • Add boolean option :include_inputs to Floki.text/2 that changes the result of this function to include the values of inputs. So if there is any input with a "value" attribute, we now include that value if this option is set to true. Thanks @viniciusmuller.
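A small sketch of the difference :include_inputs makes, assuming the default parser. The form markup is hypothetical.

```elixir
# Pull Floki from Hex for a standalone script (assumption: network access).
Mix.install([{:floki, "~> 0.36"}])

# Hypothetical form, for illustration only.
html = ~s(<form><label>Name</label><input type="text" value="Ada"></form>)
{:ok, doc} = Floki.parse_fragment(html)

# Without the option, only text nodes contribute to the result.
plain = Floki.text(doc)

# With the option, the "value" attribute of the input is included as well.
with_inputs = Floki.text(doc, include_inputs: true)

IO.inspect(plain)
IO.inspect(with_inputs)
```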

Fixed

  • Fix find of elements by classes that contain colons. This is useful for when people are trying to find elements that contain Tailwind classes. Thanks @viniciusmuller.

  • Fix some typespecs that were using types from private modules. This is a fix to the documentation.

0.34.2 - 2023-02-24

Added

  • Add option to pass down arguments to the parser in Floki.parse_document/2 and Floki.parse_fragment/2. Thanks @Kuret.

  • Add support for returning more elements from the Floki.traverse_and_update/2 function callback. This enables the creation of more elements in the tree, but should be used with care, since the tree can grow a lot if the change is not controlled. Thanks @martosaur.

0.34.1 - 2023-02-11

Fixed

  • Fix pseudo-class ":not" selector parsing halting point. This is a fix for when a "pseudo-class" ":not" that contains an attribute selector is followed by another selector. This is an example: "a:not([class]), div".

  • Ignore decimal numeric char ref when number is negative.
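The ":not" fix above can be exercised with the selector quoted in that entry. This is only a sketch against hypothetical markup, assuming the default parser.

```elixir
# Pull Floki from Hex for a standalone script (assumption: network access).
Mix.install([{:floki, "~> 0.36"}])

# Hypothetical markup: one link with a class, one without, and a div.
html = ~s(<a class="btn" href="/a">A</a><a href="/b">B</a><div>C</div>)
{:ok, doc} = Floki.parse_fragment(html)

# ":not([class])" is followed by a second selector after the comma.
# Both parts should be honoured: the class-less link and the div.
found = Floki.find(doc, "a:not([class]), div")

IO.inspect(found)
```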

0.34.0 - 2022-11-03

Added

  • User configurable "self-closing" tags. Now it's possible to define which tags are considered "self-closing". Thanks @inoas.

Fixed

  • Allow attribute values to not be escaped. This fixes Floki.raw_html/2 when used with the option encode: false. Thanks @juanazam.
  • Fix traverse_and_update/3 spec. Thanks @WLSF.

Changed

  • Drop support for Elixir 1.9 and 1.10.
  • Remove html_entities dependency. We now use an internal encoder/decoder for entities.
  • Change the main branch name to main.

0.33.1 - 2022-06-28

Fixed

  • Remove some warnings for unused code.

0.33.0 - 2022-06-28

Added

  • Add support for searching elements that contain text in a case-insensitive manner with fl-icontains - thanks @nuno84

Changed

  • Drop support for Elixir 1.8 and 1.9.
  • Fix and improve internal things - thanks @derek-zhou and @hissssst

0.32.1 - 2022-03-24

Fixed

  • Allow root nodes to be selected using pseudo-classes - thanks @rzane

0.32.0 - 2021-10-18

Added

  • Add an HTML tokenizer written in Elixir - this is still experimental and is not a stable API yet.
  • Add support for HTML IDs containing periods in the selectors - thanks @Hugo-Hache
  • Add support for case-insensitive CSS attribute selectors - thanks @fcapovilla
  • Add the :root pseudo-class selector - thanks @fcapovilla

0.31.0 - 2021-06-11

Changed

  • Treat style and title tags as plaintext in Mochiweb - thanks @SweetMNM

0.30.1 - 2021-03-29

Fixed

  • Fix typespecs of Floki.traverse_and_update/2 to make clear that it does not accept text nodes directly.

0.30.0 - 2021-02-06

Added

  • Add ":disabled" pseudo selector - thanks @vnegrisolo
  • Add Gleam adapter - thanks @CrowdHailer
  • Add pretty option to Floki.raw_html/2 - thanks @evaldobratti
  • Add html_parser option to parse_ functions. This enables a more dynamic and functional configuration of the HTML parser in use.

Changed

  • Remove support for Elixir 1.7 - thanks @carlosfrodrigues
  • Replace IO.warn by Logger.info for deprecation warnings - thanks @juulSme

Fixed

  • Fix typespecs for find, attr and attribute functions - thanks @mtarnovan
  • Documentation Improvements - thanks @kianmeng

0.29.0 - 2020-10-02

Added

  • Add Floki.find_and_update/3 that updates nodes inside a tree, like traverse and update but without allowing changes in the children nodes. Therefore the tree cannot grow in size, but can have nodes removed.
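A hedged sketch of Floki.find_and_update/3, assuming the callback receives a {tag, attributes} pair and may return either an updated pair or :delete. The markup and selectors are hypothetical.

```elixir
# Pull Floki from Hex for a standalone script (assumption: network access).
Mix.install([{:floki, "~> 0.36"}])

# Hypothetical markup, for illustration only.
html = ~s(<div><a href="http://example.com">Link</a><span class="ad">Ad</span></div>)
{:ok, doc} = Floki.parse_fragment(html)

# Rewrite the href of matching links; children are left untouched.
updated =
  Floki.find_and_update(doc, "a", fn
    {"a", [{"href", "http://" <> rest}]} -> {"a", [{"href", "https://" <> rest}]}
    other -> other
  end)

# Returning :delete removes the matched node from the tree.
without_ads = Floki.find_and_update(updated, "span.ad", fn _ -> :delete end)

IO.inspect(without_ads)
```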

Changed

  • Deprecate Floki.map/2 because we now have Floki.find_and_update/3 and Floki.traverse_and_update/2, which are more powerful APIs. Floki.map/2 can be replaced by Enum.map/2 as well - thanks @josevalim for the idea!
  • Update optional dependency fast_html to v2.0.4

Fixed

  • Fix a bug when parsing HTML with XML inside using Mochiweb's parser

Improvements

  • Add more typespecs

0.28.0 - 2020-08-26

Added

  • Add support for :checked pseudo-class selector - thanks @wojtekmach

Changed

  • Drop support for Elixir 1.6
  • Update version of fast_html to 2.0 in docs and CI - thanks @rinpatch

Fixed

  • Fix docs by mentioning HTML nodes supported for traverse_and_update - thanks @hubertlepicki

0.27.0 - 2020-07-07

Added

  • Floki.filter_out/2 now can filter text nodes - thanks @ckruse
  • Support more encoding entities in Floki.raw_html/1 - thanks @ntenczar

Fixed

  • Fix Floki.attribute/2 when there is only text nodes in the document - thanks @ckruse

Improvements

  • Performance improvements of Floki.raw_html/1 function - thanks @josevalim
  • Improvements in the docs and specs of Floki.traverse_and_update/2 and Floki.children/1 - thanks @josevalim
  • Improvements in the spec of Floki.traverse_and_update/2 - thanks @Dalgona
  • Improve the CI setup to run the formatter correctly - thanks @Cleidiano

0.26.0 - 2020-02-17

Added

  • Add support for the pseudo-class selectors :nth-last-child and :nth-last-of-type

Fixed

  • Fix the typespecs of Floki.traverse_and_update/3 - thanks @RichMorin

Changed

  • Update optional dependency fast_html to v1.0.3

0.25.0 - 2020-01-26

Added

  • Add Floki.parse_fragment!/1 and Floki.parse_document!/1, which have the same functionality as the functions without the bang, but return the document or fragment directly instead of wrapping it in a result tuple, and raise an exception in case of errors - thanks @schneiderderek
  • Add Floki.traverse_and_update/3, which accepts an accumulator that is useful for keeping state while traversing the HTML tree - thanks @Dalgona
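A minimal sketch of the accumulator variant, assuming the callback takes (node, acc) and returns {node, acc}. The markup and the "data-index" attribute are hypothetical.

```elixir
# Pull Floki from Hex for a standalone script (assumption: network access).
Mix.install([{:floki, "~> 0.36"}])

# Hypothetical markup, for illustration only.
{:ok, doc} = Floki.parse_fragment("<div><p>one</p><p>two</p></div>")

# Number each paragraph and count them, carrying the counter as state.
{updated, count} =
  Floki.traverse_and_update(doc, 0, fn
    {"p", attrs, children}, acc ->
      {{"p", [{"data-index", Integer.to_string(acc)} | attrs], children}, acc + 1}

    # Any other node (here, the wrapping div) passes through unchanged.
    other, acc ->
      {other, acc}
  end)

IO.inspect(updated)
IO.inspect(count)
```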

Changed

  • Update the html_entities dependency from v0.5.0 to v0.5.1

0.24.0 - 2020-01-01

Added

  • Add support for fast_html, which is a "C Node" wrapping Lexborisov's myhtml - thanks @rinpatch
  • Add setup to run our test suite against all parsers on CI - thanks @rinpatch
  • Add Floki.parse_document/1 and Floki.parse_fragment/1 in order to correctly parse documents and fragments of documents - it also prevents the confusion and inconsistency of parse/1.
  • Configure dialyxir in order to run Dialyzer easily.

Changed

  • Deprecate Floki.parse/1 and all the functions that use it underneath. This means that all the functions that accepted HTML as binary are deprecated as well. This includes find/2, attr/4, filter_out/2, text/2 and attribute/2. The recommendation is to use those functions with an already parsed document or fragment.
  • Remove support for Elixir 1.5.

0.23.1 - 2019-12-01

Fixed

  • Fix the Mochiweb parser when there is an invalid charref.

0.23.0 - 2019-09-11

Changed

  • Remove Mochiweb as a hex dependency. It brings the code from the original project to Floki's codebase - thanks @josevalim

0.22.0 - 2019-08-21

Added

  • Add Floki.traverse_and_update/2, which works in a similar way to Floki.map/2 but traverses the tree and updates the children elements. The difference from "map" is that this function can create a tree with more or fewer nodes. - thanks @ericlathrop

Changed

  • Remove support for Elixir 1.4.

0.21.0 - 2019-04-17

Added

  • Add a possibility to filter style tags on Floki.text/2 - thanks @Vict0rynox

Fixed

  • Fix Floki.text/2 to consider the previous filter of js when filtering style - thanks @Vict0rynox
  • Fix typespecs for Floki.filter_out/2 - thanks @myfreeweb

Changed

  • Drop support for Elixir 1.3 and below - thanks @herbstrith

0.20.4 - 2018-09-24

Fixed

  • Fix Floki.raw_html to accept lists as attribute values - thanks @katehedgpeth

0.20.3 - 2018-06-22

Fixed

0.20.2 - 2018-05-09

Fixed

  • Fix Floki.raw_html/1 to correctly handle single and double quotes in attributes - thanks @grych

0.20.1 - 2018-04-05

Fixed

  • Remove Enumerable.slice/1 compile warning for Floki.HTMLTree - thanks @thecodeboss
  • Fix Floki.find/2 that was failing on HTML that consists entirely of a comment - thanks @ShaneWilton

0.20.0 - 2018-02-06

Added

  • Configurable raw_html/2 to allow optional encode of HTML entities - thanks @davydog187
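As an illustration of the :encode option, the sketch below renders the same hand-built node with and without entity encoding. The node is hypothetical.

```elixir
# Pull Floki from Hex for a standalone script (assumption: network access).
Mix.install([{:floki, "~> 0.36"}])

# Hypothetical node built by hand, for illustration only.
node = {"p", [], ["a < b"]}

# Default behaviour: special characters in text are encoded as HTML entities.
encoded = Floki.raw_html(node)

# With encode: false the text is emitted as-is.
raw = Floki.raw_html(node, encode: false)

IO.puts(encoded)
IO.puts(raw)
```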

Fixed

  • Fix serialization of the tree after updating attribute - thanks @francois2metz

0.19.3 - 2018-01-25

Fixed

  • Skip HTML entities encode for Floki.raw_html/1 for script or style tags
  • Add :html_entities app to the list of OTP applications. It fixes production releases.

0.19.2 - 2017-12-22

Fixed

  • (BREAKING CHANGE) Re-encode HTML entities on Floki.raw_html/1.

0.19.1 - 2017-12-04

Fixed

0.19.0 - 2017-11-11

Added

  • Added support for nth-of-type, first-of-type, last-of-type and last-child pseudo-classes - thanks @saleem1337.
  • Added support for nth-child pseudo-class functional notation - thanks @nirev.
  • Added functional notation support for nth-of-type pseudo-class.
  • Added a Contributing guide.

Fixed

  • Format all files according to the Elixir 1.6 formatter - thanks @fcevado.
  • Fix Floki.raw_html to support raw text - thanks @craig-day.

0.18.1 - 2017-10-13

Added

Fixed

  • Fix XML tag when building HTML tree.
  • Return empty list when Floki.filter_out/2 result is empty.

0.18.0 - 2017-08-05

Added

  • Added Floki.attr/4 that receives a function enabling manipulation of attribute values - thanks @erikdsi.
  • Implement the String.Chars protocol for Floki.Selector.
  • Implement the Enumerable protocol for Floki.HTMLTree.

Changed

  • Changed Floki.transform/2 to Floki.map/2 and Floki.Finder.apply_transform/2 to Floki.Finder.map/2 - thanks @aphillipo.

Fixed

  • Fix Floki.raw_html/1 to consider XML prefixes - thanks @sergey-kintsel.
  • Fix raw_html for self closing tags with content - thanks @navinpeiris.

Removed

  • Removed support for Elixir 1.2.

0.17.2 - 2017-05-25

Fixed

0.17.1 - 2017-05-22

Fixed

  • Fix search when body has unencoded angles (< and >) - thanks @sergey-kintsel
  • Fix crash caused by XML declaration inside body - thanks @erikdsi
  • Fix issue when finding fails if HTML begins with XML tag - thanks @sergey-kintsel

0.17.0 - 2017-04-12

Added

  • Add support for multiple pseudo-selectors, like :not() and :nth-child() - thanks @jjcarstens
  • Add support for multiple selectors inside the :not() pseudo-class selector - thanks @jjcarstens

0.16.0 - 2017-04-05

Added

  • Add support for selectors that only include a pseudo-class selector - thanks @buhman
  • Add support for a new selector: fl-contains, which returns elements that contain a given text - thanks @buhman
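A short sketch of the fl-contains selector, written against the current parsing API rather than the one available at the time of this release. The markup is hypothetical.

```elixir
# Pull Floki from Hex for a standalone script (assumption: network access).
Mix.install([{:floki, "~> 0.36"}])

# Hypothetical markup, for illustration only.
{:ok, doc} = Floki.parse_fragment("<p>Hello world</p><p>Goodbye</p>")

# Select only the paragraphs whose text contains "Hello".
found = Floki.find(doc, "p:fl-contains('Hello')")

IO.inspect(found)
```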

Fixed

  • Fix :not() pseudo-class selector to accept simple pseudo-class selectors as well - thanks @mischov

0.15.0 - 2017-03-14

Added

  • Added support for the :not() pseudo-class selector.

Fixed

  • Fixed pseudo-class selectors that are used in conjunction with combinators - thanks @Eiji7
  • Fixed order of elements after search using descendant combinator - thanks @Eiji7

0.14.0 - 2017-02-07

Added

  • Added support for configuring html5ever as the HTML parser. Issue #83 - thanks @hansihe and @aphillipo!

0.13.2 - 2017-02-07

Fixed

  • Fixed bug that was causing Floki.text/1 and Floki.filter_out/2 to ignore "trees" with only text nodes. Issue #91 - thanks @boydm.

0.13.1 - 2017-01-22

Fixed

  • Fix ordering of duplicated descendant matches - thanks @mmmries
  • Fix ordering of Floki.text/1 when there are only root nodes - thanks @mmmries

0.13.0 - 2017-01-22

Added

  • Floki.filter_out/2 is now able to understand complex selectors to filter out from the tree.

0.12.1 - 2017-01-20

Fixed

  • Fix search for elements using descendant combinator - issue #84 - thanks @mmmries

0.12.0 - 2016-12-28

Added

  • Add basic support for nth-child pseudo-class selector. Closes issue #64.

Changed

  • Remove support for Elixir 1.1 and below.
  • Remove public documentation for internal code.

0.11.0 - 2016-10-12

Added

  • First attempt to transform nodes with Floki.transform/2. It is not able to update the tree yet, but works well with results from Floki.find/2 - thanks @bobjflong

Changed

  • Using Logger to notify unknown tokens in selector parser - thanks @teamon and @geonnave
  • Replace mochiweb_html with mochiweb package. This is needed to fix conflict with other packages that are using mochiweb. - thanks @aphillipo

0.10.1 - 2016-08-28

Fixed

  • Fix sibling search after immediate children - thanks @gmile.

0.10.0 - 2016-08-05

Changed

  • Change the search for namespaced elements using the correct CSS3 syntax.

Fixed

  • Fix the search for child elements when it is more than two elements deep - thanks @gmile

0.9.0 - 2016-06-16

Added

  • A separator between text when getting text from nodes - thanks @rochdi.

0.8.1 - 2016-05-20

Added

  • Support rendering boolean attributes on Floki.raw_html/1 - thanks @iamvery.

Changed

  • Update Mochiweb HTML parser dependency to version 2.15.0.

0.8.0 - 2016-03-06

Added

  • Add possibility to search tags with namespaces.
  • Accept Floki.Selector as parameter of Floki.find/2 instead of only strings - thanks @hansihe.

Changed

  • Using a smaller package with only the Mochiweb HTML parser.

0.7.2 - 2016-02-23

Fixed

  • Replace <br> nodes by newline (\n) in DeepText - thanks @maxneuvians.
  • Allow FilterOut to filter special nodes, like comment.

0.7.1 - 2015-11-14

Fixed

  • Ignore PHP scripts when finding nodes.

0.7.0 - 2015-11-03

Added

  • Add support for excluding script nodes in Floki.text. By default, it will exclude those nodes, but including them can be enabled with the flag js: true - thanks @vikeri!

Fixed

  • Fix find for sibling nodes when the preceding selector matches an element at the end of the sibling list - fix issue #39

0.6.1 - 2015-10-11

Fixed

  • Fix Floki.raw_html/1 to build HTML comments properly.

0.6.0 - 2015-10-07

Added

  • Add Floki.raw_html/2.

0.5.0 - 2015-09-27

Added

  • Add the child combinator to Floki.find/2.
  • Add the adjacent sibling combinator to Floki.find/2.
  • Add the general adjacent sibling combinator to Floki.find/2.

0.4.1 - 2015-09-18

Fixed

  • Ignore files that are not lexer files (".xrl") under the src/ directory in the Hex package. This fixes a crash when compiling using OTP 17.5 on Mac OS X. Huge thanks to @henrik and @licyeus, who pointed out the issue!

0.4.0 - 2015-09-17

Added

  • A robust representation of selectors in order to enable queries using a mix of selector types, such as classes with attributes, attributes with types, classes with classes and so on. Here is a list with examples of what is possible now:
    • Floki.find(html, "a.foo")
    • Floki.find(html, "a.foo[data-action=post]")
    • Floki.find(html, ".foo.bar")
    • Floki.find(html, "a.foo[href$='.org']")
    Thanks to @licyeus for pointing out the issue!
  • Include mochiweb in the applications list at mix.exs - thanks @EricDykstra

Changed

  • Floki.find/2 will now return a list instead of a tuple when searching only by IDs. From now on, Floki should always return the results inside a list, even if it's an ID match.

Removed

  • Floki.find/2 does not accept tuples as selectors anymore. This is because with the robust selectors representation, it won't be necessary to query directly using tuples or other data structures instead of strings.

0.3.3 - 2015-08-23

Fixed

  • Fix Floki.find/2 when there is a non-HTML input. It closes the issue #17

0.3.2 - 2015-06-27

Fixed

  • Fix Floki.DeepText when there is a comment inside nodes.

0.3.1 - 2015-06-21

Fixed

  • Fix Floki.find/2 to consider XML trees.

0.3.0 - 2015-06-07

Added

  • Add attribute equals selector. This feature enables the user to search using HTML attributes other than "class" or "id". E.g: Floki.find(html, "[data-model=user]") - @nelsonr

0.2.1 - 2015-06-04

Fixed

  • Fix parse/1 when parsing a part of HTML without a root node - @antonmi

0.2.0 - 2015-05-03

Added

  • Support HTML string when searching for attributes with Floki.attribute/2.
  • Option for Floki.text/2 to disable deep search and use flat search instead.

Changed

  • Change Floki.text/1 to perform a deep search of text nodes.
  • Consider doctests in the test suite.

0.1.1 - 2015-03-25

Added

Changed

  • Using MochiWeb as a hex dependency instead of embedded code. It closes the issue #5

0.1.0 - 2015-02-15

Added

  • Descendant selectors, like ".class tag" to Floki.find/2.
  • Multiple selection, like ".class1, .class2" to Floki.find/2.

0.0.5 - 2014-12-21

Added

  • Floki.text/1, which returns all text in the same level of the parent element inside HTML.

Changed

  • Elixir version requirement from "~> 1.0.0" to ">= 1.0.0".