Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make sklearn embedding backend auto-select more cautious #1984

Merged

Conversation

freddyheppell
Copy link
Contributor

Fixes #1980

  • Checks which module the error ocurred in before selecting an sklearn backend by default. If it was sentence_transformers, select skelarn as it's probably a minimal install. If any other module, re-raise as this is abnormal.
  • Logs an INFO message when the sklearn backend is selected, if verbose is enabled
  • Updates the comments on select_backend as I think they were quite out of date

I ended up instantiating a new logger for this as it felt better to pass verbose than the whole logger.

@freddyheppell freddyheppell changed the title Make sklearn embedding backend more cautious Make sklearn embedding backend auto-select more cautious May 10, 2024
Copy link
Owner

@MaartenGr MaartenGr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I had a few questions/suggestions but other than that it looks good. There is currently an issue with the pipeline as a result of the scikit-learn v1.5 upgrade. To get this pipeline working for you, I would advise implementing the same fix as #2008. That way, we can run the pipeline and see if everything works as intended.

bertopic/backend/_utils.py Show resolved Hide resolved
bertopic/backend/_utils.py Show resolved Hide resolved
@freddyheppell freddyheppell force-pushed the fix-backend-selection-default branch from 57d4a9f to 6ac4cb1 Compare May 30, 2024 13:26
@freddyheppell
Copy link
Contributor Author

freddyheppell commented May 30, 2024

Just rebased this branch onto the latest master so it has the #2008 fix now

@MaartenGr
Copy link
Owner

@freddyheppell Thanks for your patience concerning the pipeline fix. Everything looks good to me so I'll go ahead and merge it. Thanks for your contribution to the repo! 😄

@MaartenGr MaartenGr merged commit d678680 into MaartenGr:master May 31, 2024
5 checks passed
@freddyheppell freddyheppell mentioned this pull request Jun 3, 2024
4 tasks
@freddyheppell freddyheppell deleted the fix-backend-selection-default branch June 3, 2024 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Warn when automatically choosing SklearnEmbedder backend
2 participants