Wrong capitalized letter in Portuguese names #72
Comments
Thanks for the ticket. I can add "do" to the prefixes, which will fix the capitalization. What should the correct parsing of the name be? This is what it does currently. Is that correct?
$ python tests.py "Joao da Silva do Amaral de Souza"
<HumanName : [
title: ''
first: 'Joao'
middle: 'da Silva do Amaral de'
last: 'Souza'
suffix: ''
nickname: ''
]>
If that's not the correct way to parse Portuguese names, just open another ticket and I'll fix it.
Hello @derek73, sorry for the delay. Yes, it is correct. Thank you.
Hey @kelvins, I'm working on fixing a different bug, and my fix results in a new parsing of your example name where everything ends up in the last name:
$ python tests.py "Joao da Silva do Amaral de Souza"
<HumanName : [
title: ''
first: 'Joao'
middle: ''
last: 'da Silva do Amaral de Souza'
suffix: ''
nickname: ''
]>
My understanding of Portuguese last names is limited, but it seems like it could be more technically correct to include them all as last names since they appear to be joined by conjunctions. How do you feel about that? If it's obviously not correct to you, I'd like to know the reasons why so I can figure out how to change my fix for #60.
Hello @derek73, to be honest, in Brazilian Portuguese we do not often use only the last name. We usually use the entire surname, for example:
First name: João
Personally, I don't think it is correct to say
If you're interested, here are some other examples of Brazilian names:
I hope it helps.
Thanks, that's what I needed. The table is super helpful. Here's my interpretation; let me know if I got anything wrong.
do/da/de/dos/das/des are always connected to the piece that follows them, so we can always treat each pair as one name piece. The attribute bucket the pair belongs to (first, middle, etc.) is then determined by the position of the pair (e.g. da + following_piece) in the name, just as if the pair were a single name piece. As long as those articles stay connected to the thing that follows them, we should be able to parse out the position just like any other name.
I think I can do that. It just means that middle names can have prefixes too, and right now I only allow them on last names.
Also, I noticed I only have the plural of one of those articles right now. I don't have "des" and "das" as prefixes. Are those possible? I only see a "dos" in your example names.
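The interpretation above can be sketched in a few lines of Python. This is only an illustration of the idea, not the library's implementation: `PREFIXES`, `join_prefixes`, and `parse_name` are hypothetical names, the prefix set is an assumption, and titles, suffixes, and nicknames are ignored.

```python
# Hypothetical prefix set for illustration; not the library's actual constants.
PREFIXES = {"da", "de", "do", "dos"}


def join_prefixes(words):
    """Attach each prefix to the word that follows it, forming one name piece."""
    pieces = []
    pending = []  # prefixes waiting for the word they attach to
    for word in words:
        if word.lower() in PREFIXES:
            pending.append(word)
        else:
            pieces.append(" ".join(pending + [word]))
            pending = []
    # A trailing prefix with nothing after it is kept as its own piece.
    if pending:
        pieces.append(" ".join(pending))
    return pieces


def parse_name(full_name):
    """Assign joined pieces to first/middle/last purely by position."""
    pieces = join_prefixes(full_name.split())
    if not pieces:
        return {"first": "", "middle": "", "last": ""}
    return {
        "first": pieces[0],
        "middle": " ".join(pieces[1:-1]),
        "last": pieces[-1] if len(pieces) > 1 else "",
    }


print(parse_name("Joao da Silva do Amaral de Souza"))
```

With this positional rule, "da Silva" and "do Amaral" land in the middle bucket and "de Souza" becomes the last name, because each pair is counted as a single piece.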
Yes, that's exactly it. You can treat the pair (do/da/de/dos + following_piece) as a single name piece. In Portuguese, I have never seen "des" or "das" as prefixes. If they actually exist, they should be very rare. Here is a List of most common surnames in South America. Note that there are variations in surnames, for example:
while continuing to support multiple names after a prefix #23
I just released v1.0, which I think handles prefixes as I described. This is the output now for your test name:
$ python tests.py "Joao da Silva do Amaral de Souza"
<HumanName : [
title: ''
first: 'Joao'
middle: 'da Silva do Amaral'
last: 'de Souza'
suffix: ''
nickname: ''
]>
Probably best, then, to leave "des" and "das" out of the constants if they are rare; they would more likely cause errors than produce the desired output. Thanks again for your help on this. I like it when it works right. :) Let me know if I misinterpreted anything.
Perfect, thank you so much @derek73.
In Portuguese, 'do' is a contraction of the preposition 'de' plus the article 'o', but in Portuguese you never see: In Portuguese, there is no tradition of 'Title', 'first', 'middleName', 'lastname'. I speak Brazilian Portuguese.
First of all, congrats on the great project.
I have found a small issue related to Portuguese names. By running the following code:
I get the following result:
'Joao da Silva Do Amaral de Souza'
when it should be:
'Joao da Silva do Amaral de Souza'
The 'd' from 'do' should be lowercase.
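For reference, the expected capitalization can be sketched without the library. This is a rough illustration under stated assumptions, not how the project does it: `LOWERCASE_PREFIXES` and `capitalize_name` are hypothetical names, and the prefix set is assumed.

```python
# Hypothetical set of prefixes that stay lowercase; an assumption for illustration.
LOWERCASE_PREFIXES = {"da", "de", "do", "dos"}


def capitalize_name(full_name):
    """Capitalize every word except known prefixes that are not the first word."""
    result = []
    for index, word in enumerate(full_name.split()):
        if index > 0 and word.lower() in LOWERCASE_PREFIXES:
            result.append(word.lower())
        else:
            result.append(word.capitalize())
    return " ".join(result)


print(capitalize_name("joao da silva do amaral de souza"))
```

The first word is always capitalized, so a given name that happens to match a prefix is not lowercased.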