Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

leftmost_find_iter_from_iter_from? #105

Open
Gowee opened this issue Nov 29, 2024 · 1 comment
Open

leftmost_find_iter_from_iter_from? #105

Gowee opened this issue Nov 29, 2024 · 1 comment

Comments

@Gowee
Copy link

Gowee commented Nov 29, 2024

Hi!

Would it be possible add leftmost_find_iter_from_iter_from in the future? Is it just not implemented yet or technically infeasible?

@Gowee Gowee changed the title leftmost_find_iter_from_iter_from leftmost_find_iter_from_iter_from? Nov 29, 2024
@vbkaisetsu
Copy link
Member

vbkaisetsu commented Dec 3, 2024

Technical reasons.
While the standard match can accept an iterator, the left-most match must accept a seekable array.

Here is an example:

patterns: ["abbabbb", "abb"]
haystack: "abbabba"
indices:   0123456

The overlapping match works as follows:

0
1
2 <- Outputs [0..3] (abb)
3
4
5 <- Outputs [3..6] (abb)
6
EOS

The left-most match works as follows:

0
1
2 <- Updates output candidate [0..3] (abb)
3
4
5
6 <- Fails. Outputs [0..3], resets the automaton, and seeks to idx=3 (Most recent successful match position)
3
4
5 <- Update output candidate [3..6] (abb)
6
EOS <- Outputs [3..6]

In this paper, the left-most longest match works without seeking the given haystack, but each state must have a variable-length array.
Daachorse represents each state in the space of only 12 bytes for cache efficiency, so we don't want to increase in capacity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants