add allow list to tokenizer #148

Merged on Oct 12, 2022 (22 commits)


@yenwel (Contributor) commented Oct 7, 2022

Pull Request

Related issue

Fixes #132

What does this PR do?

  • Add allow list configuration to detection, segmenter, and tokenizer
  • Add documentation and tests
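
To illustrate the idea behind an allow list, here is a minimal, self-contained sketch. It is not the code from this PR: `Script`, `Language`, and `pick_language` are hypothetical stand-ins, and the behavior shown (restricting language detection to the languages listed for a script, with no restriction when a script has no entry) is an assumption made for illustration.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for illustration; not the crate's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum Script {
    Latin,
    Cyrillic,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Language {
    Eng,
    Fra,
    Deu,
    Rus,
}

/// Returns the highest-scoring candidate that the allow list permits.
/// Assumption: when no allow list is given, or the script has no entry
/// in it, every candidate is permitted.
pub fn pick_language(
    script: Script,
    candidates: &[(Language, f64)],
    allow_list: Option<&HashMap<Script, Vec<Language>>>,
) -> Option<Language> {
    // Languages permitted for this script, if any restriction applies.
    let allowed = allow_list.and_then(|list| list.get(&script));
    candidates
        .iter()
        // Keep a candidate if there is no restriction or it is listed.
        .filter(|(lang, _)| allowed.map_or(true, |langs| langs.contains(lang)))
        // Pick the candidate with the highest score.
        .max_by(|a, b| a.1.total_cmp(&b.1))
        .map(|(lang, _)| *lang)
}
```

In this sketch, passing `None` leaves detection unrestricted, while a map entry for a script narrows the result to the listed languages only.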

PR checklist

Please check if your PR fulfills the following requirements:

  • Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
  • Have you read the contributing guidelines?
  • Have you made sure that the title is accurate and descriptive of the changes?

@meili-bot (Contributor)

This message is sent automatically

Hello @yenwel,
Thank you very much for contributing to Meilisearch ❤️.
The team is not available on weekends, but they will be back on Monday 😊

@curquiza requested a review from ManyTheFish on October 10, 2022, 08:39

@ManyTheFish (Member) left a comment

Hello @yenwel!
I requested some changes, and they are not always easy because there are some lifetime issues. Don't hesitate to ask if you need help! 😄

Resolved review threads:
  • src/tokenizer.rs (3, outdated)
  • src/detection/mod.rs (3, outdated)
  • src/segmenter/mod.rs (2, one outdated)
@ManyTheFish (Member) left a comment

Ok, I'll investigate this move tomorrow! But I think we can merge your PR keeping it. 🤔

Resolved review threads:
  • src/detection/mod.rs (2, outdated)
@ManyTheFish (Member) left a comment

Looks good to me!
I'll let bors run the tests; if everything passes, your PR will be merged automatically.
Thank you for your contribution despite its difficulty!

bors merge

@bors (Contributor) commented Oct 12, 2022

Build succeeded:

  • tests

@bors merged commit 3d735c7 into meilisearch:main on Oct 12, 2022

@meili-bot (Contributor)

This message is sent automatically

Thank you for contributing to Meilisearch. If you are participating in Hacktoberfest and would like to receive a gift from Meilisearch too, please complete this form.

Linked issue: Add an allowlist to the tokenizer builder