Support for other languages #104
Comments
I think you can set a custom scorer - the default is PR #90 also implements more changes on this front, though it appears the dev stopped working on it. I might try and re-submit the PR myself (and hope the tests pass) later tonight, which should further help solve your issue. Let me know if this helps.
Thank you very much, and sorry for my late reply. Thanks again,
@tester88 you might have some success with a fork I have been working on: https://github.com/medecau/fuzzywuzzy/tree/master
Great, I will check it out.
Hi @josegonzalez, any progress on this? It seems that this is not yet supported for Arabic. If I run the following code:
Although there are differences in terms of 'hamza' characters (ء), and even though "الأربع" has a totally different letter that doesn't exist in the rest, I still get a score of 83 for all of them.
Thanks
There hasn't been any progress. You'd need to write your own custom scorer to handle non-ASCII character sets. It isn't likely that there will be any progress in this repository, as the folks who originally worked on and maintained it have since left the company (myself included). If you find a solution, feel free to post it here for other users.
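For anyone landing here, a custom scorer along those lines could look roughly like the sketch below. It is not part of fuzzywuzzy: the names `unicode_process` and `unicode_token_sort_ratio` are hypothetical, and the similarity measure is the standard library's `difflib.SequenceMatcher`, which is an assumption about what "good enough" means here. The only idea it illustrates is keeping letters, marks, and digits from any script instead of filtering down to ASCII.

```python
import unicodedata
from difflib import SequenceMatcher


def unicode_process(text):
    """Hypothetical preprocessor: normalize, lowercase, keep letters/marks/digits of any script."""
    text = unicodedata.normalize("NFKC", text).lower()
    # Unicode categories starting with L (letter), M (mark) or N (number) are kept;
    # everything else (punctuation, symbols, whitespace) becomes a separator.
    kept = [ch if unicodedata.category(ch)[0] in "LMN" else " " for ch in text]
    return " ".join("".join(kept).split())


def unicode_token_sort_ratio(s1, s2):
    """Hypothetical scorer: 0-100 similarity after sorting tokens; 0 if either side is empty."""
    t1 = " ".join(sorted(unicode_process(s1).split()))
    t2 = " ".join(sorted(unicode_process(s2).split()))
    if not t1 or not t2:
        return 0
    return int(round(100 * SequenceMatcher(None, t1, t2).ratio()))


if __name__ == "__main__":
    print(unicode_token_sort_ratio("hello world", "world hello"))
    print(unicode_token_sort_ratio("الأربع", "الاربع"))
```

Note that this treats every character difference equally, so a hamza variant costs the same as any other substitution. Whether that is the right notion of "match" for Arabic is a separate decision that the sketch does not settle.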
Thanks @josegonzalez for getting back to me. I appreciate it. I am sorry to hear that though. Wish you all the best in your new pursuit.
Have you ever managed to implement this or find a solution of fuzzy string matching for Arabic (python)? I’m looking for something similar as well
@WorksbyBBS Note that the problem being solved is: fuzzy string match. The nature of what it means "to match" is itself fuzzy. Good luck.
I no longer support fuzzywuzzy, fuzzywuzzy forks, nor any project started or maintained by seatgeek. |
Hi,
First of all, thanks for maintaining this.
I just noticed that both token_sort_ratio and token_set_ratio don't support Arabic characters. I don't know about other non-English ones, but at least they don't support Arabic.
They return 0 when comparing anything with an Arabic string, even if both strings are Arabic.
So I'm just wondering whether this is a bug, or whether it simply doesn't support non-English characters?
Thanks
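One plausible explanation for the 0 scores, assuming the preprocessing step discards every non-ASCII character before comparing (an assumption, not something confirmed in this thread): an all-Arabic string would be reduced to an empty string, leaving nothing to compare. The snippet below only illustrates that effect; `strip_non_ascii` is a hypothetical stand-in, not the library's actual function.

```python
def strip_non_ascii(text):
    """Hypothetical stand-in for an ASCII-only cleanup step."""
    return "".join(ch for ch in text if ord(ch) < 128).strip()


arabic = "الأربع"
mixed = "Cairo القاهرة"

print(repr(strip_non_ascii(arabic)))  # prints ''
print(repr(strip_non_ascii(mixed)))   # prints 'Cairo'
```

If this is what happens, two different Arabic strings would both collapse to the same empty input, which would explain getting 0 regardless of their content.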