IndexError on Strings containing Certain Characters #10
Comments
Same here.
Having the same problem here.
Hi @matt-buckley, @ninikolov and @isu-shrestha, Best,
Hi @MartinoMensio, I'll contribute a PR with an additional test case containing a minimal document sample that caused a crash. That way future iterations get better checks and you can reproduce the issue yourself.
Hi @dennlinger, Best,
When running a basic NLP model like en_core_web_lg with only an entityLinker pipe added, calling nlp() throws an IndexError on certain strings, particularly those containing whitespace characters such as newlines. The error and the line that raises it are:
`def get_candidates_in_sent(self, sent, doc):
----> root = list(filter(lambda token: token.dep == "ROOT", sent))[0]
excluded_children = []
candidates = []
IndexError: list index out of range`
I'm running Python version 3.9, spaCy version 3.2.4, and spaCy-entity-linker version 1.0.1.
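For illustration, here is a minimal sketch of why the indexing fails and one way it could be guarded. It runs without spaCy: `Token` is a hypothetical stand-in for a spaCy token, `find_root_unguarded` and `find_root` are names I made up, and the assumption that a whitespace-only sentence carries no token labelled ROOT is my reading of the traceback, not something I have confirmed in the library.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Token:
    """Hypothetical stand-in for a spaCy token (only the fields used here)."""
    text: str
    dep_: str  # dependency label as a string, e.g. "ROOT"


def find_root_unguarded(sent: Sequence[Token]) -> Token:
    # Same pattern as the failing line: indexing [0] on a possibly empty list.
    return list(filter(lambda token: token.dep_ == "ROOT", sent))[0]


def find_root(sent: Sequence[Token]) -> Optional[Token]:
    # Guarded variant: returns None instead of raising when no ROOT is found.
    return next((token for token in sent if token.dep_ == "ROOT"), None)


# Assumed shape of the problem input: a "sentence" made only of a newline token.
whitespace_only = [Token("\n", "dep")]
normal = [Token("Cats", "nsubj"), Token("sleep", "ROOT")]

try:
    find_root_unguarded(whitespace_only)
    unguarded_raised = False
except IndexError:
    unguarded_raised = True

guarded_result = find_root(whitespace_only)
normal_root = find_root(normal)
```

With a guard like this, the caller could return an empty candidate list when `find_root` gives `None`, rather than crashing the whole pipeline on one sentence.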