A PHP implementation of the Aho-Corasick string search algorithm. Mirror from https://gerrit.wikimedia.org/g/AhoCorasick - our actual code is hosted with Gerrit (please see https://www.mediawiki.org/wiki/Developer_access for contributing)

lorlev/AhoCorasick

 
 

AhoCorasick

AhoCorasick is a PHP implementation of the Aho-Corasick string search algorithm, which is an efficient way of searching a body of text for multiple search keywords.

Here is how you use it:

use AhoCorasick\MultiStringMatcher;

$keywords = new MultiStringMatcher( array( 'ore', 'hell' ) );

$keywords->searchIn( 'She sells sea shells by the sea shore.' );
// Result: array( array( 15, 'hell' ), array( 34, 'ore' ) )

$keywords->searchIn( 'Say hello to more text. MultiStringMatcher objects are reusable!' );
// Result: array( array( 4, 'hell' ), array( 14, 'ore' ) )
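Each result is a pair of a zero-based offset and the keyword found there. As a minimal sketch of how such pairs can be read back against the text, the hypothetical helper below (`matchedSubstrings` is not part of the library) cuts each match out of the input. It assumes the offsets count bytes, which holds for the ASCII sample above, and it uses the first result shown above as a literal so that it runs without the library installed.

```php
<?php
// Hypothetical helper, not part of AhoCorasick: given the searched text and
// searchIn()-style results ( [ offset, keyword ] pairs ), return the
// substring found at each reported offset.
function matchedSubstrings( string $text, array $matches ): array
{
    $found = [];
    foreach ( $matches as [ $offset, $keyword ] ) {
        // Assumption: offsets are byte positions, so substr()/strlen() apply.
        $found[] = substr( $text, $offset, strlen( $keyword ) );
    }
    return $found;
}

$text = 'She sells sea shells by the sea shore.';
// Literal copy of the first result shown above.
$matches = [ [ 15, 'hell' ], [ 34, 'ore' ] ];

print_r( matchedSubstrings( $text, $matches ) );
// Prints an array containing 'hell' and 'ore', in that order.
```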

Features

The algorithm works by constructing a finite-state machine from the set of search keywords. The time it takes to construct the machine is proportional to the sum of the lengths of the search keywords. Once constructed, the machine can locate all occurrences of all search keywords in any body of text in a single pass, making exactly one state transition per input character.
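To make the two phases concrete, here is a minimal, self-contained sketch of the textbook algorithm. It is not the library's source: `SketchMatcher` and its properties are hypothetical names chosen for illustration. Unlike the description above, this sketch follows failure links during the search instead of precomputing every transition, so a single input character may trigger more than one transition, although the total work stays linear in the length of the text.

```php
<?php
// Illustrative sketch only; SketchMatcher is a hypothetical class,
// not the library's MultiStringMatcher.
final class SketchMatcher
{
    /** @var array<int, array> trie edges: state => [ byte => next state ] */
    private array $goto = [ 0 => [] ];
    /** @var array<int, int> failure links: state => fallback state */
    private array $fail = [ 0 => 0 ];
    /** @var array<int, string[]> keywords recognized on entering a state */
    private array $out = [ 0 => [] ];

    public function __construct( array $keywords )
    {
        // Phase 1: build a trie with one state per distinct keyword prefix.
        foreach ( array_unique( $keywords ) as $keyword ) {
            $keyword = (string)$keyword;
            if ( $keyword === '' ) {
                continue;
            }
            $state = 0;
            $length = strlen( $keyword );
            for ( $i = 0; $i < $length; $i++ ) {
                $ch = $keyword[$i];
                if ( !isset( $this->goto[$state][$ch] ) ) {
                    $next = count( $this->goto );
                    $this->goto[$state][$ch] = $next;
                    $this->goto[$next] = [];
                    $this->fail[$next] = 0;
                    $this->out[$next] = [];
                }
                $state = $this->goto[$state][$ch];
            }
            $this->out[$state][] = $keyword;
        }

        // Phase 2: breadth-first pass that assigns failure links.
        // Depth-1 states keep their default fallback, the root.
        $queue = array_values( $this->goto[0] );
        for ( $head = 0; $head < count( $queue ); $head++ ) {
            $state = $queue[$head];
            foreach ( $this->goto[$state] as $ch => $child ) {
                $queue[] = $child;
                // Walk the parent's failure chain until a state has a $ch edge.
                $f = $this->fail[$state];
                while ( $f !== 0 && !isset( $this->goto[$f][$ch] ) ) {
                    $f = $this->fail[$f];
                }
                $this->fail[$child] = $this->goto[$f][$ch] ?? 0;
                // Keywords ending at the fallback state also end here.
                $this->out[$child] = array_merge(
                    $this->out[$child],
                    $this->out[$this->fail[$child]]
                );
            }
        }
    }

    /** Returns [ offset, keyword ] pairs, ordered by where each match ends. */
    public function searchIn( string $text ): array
    {
        $results = [];
        $state = 0;
        $length = strlen( $text );
        for ( $i = 0; $i < $length; $i++ ) {
            $ch = $text[$i];
            // Fall back until some state accepts $ch, or the root is reached.
            while ( $state !== 0 && !isset( $this->goto[$state][$ch] ) ) {
                $state = $this->fail[$state];
            }
            $state = $this->goto[$state][$ch] ?? 0;
            foreach ( $this->out[$state] as $keyword ) {
                $results[] = [ $i - strlen( $keyword ) + 1, $keyword ];
            }
        }
        return $results;
    }
}

$sketch = new SketchMatcher( [ 'ore', 'hell' ] );
print_r( $sketch->searchIn( 'She sells sea shells by the sea shore.' ) );
// Yields the pairs ( 15, 'hell' ) and ( 34, 'ore' ).
```

The constructor touches each keyword character once while building the trie, which is where the construction cost proportional to the total keyword length comes from. The same object can then be reused for any number of texts.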

Contribute

Support

If you are having issues, please let us know.

License

The project is licensed under the Apache license.
