Why dependency tree parser generates different parse trees of the same sentence? #990
The models are occasionally retrained when the data is updated. Did you
redownload the older models when you downgraded the package?
…On Mon, Apr 4, 2022 at 8:10 AM Ritwik Mishra ***@***.***> wrote:
I was working on a project that involves dependency parsing of Hindi
sentences and then performing "further operations" on the parses. I recently
tried to replicate some of my previous results, but I was surprised to see
different results for the same sentences I had used previously. My code for
"further operations" is rule-based and unchanged, yet my results were
different.
Fortunately, I had saved some of the dependency trees from my previous
experiment, and I noticed that stanza now generates slightly different
dependency trees. This usually happens with long sentences. For example,
consider the following sentence:
कालांतर में इसके हजारों प्रदर्शन हुए और पड़ोसी देश rajsthan , बांग्लादेश और पाकिस्तान में भी इसकी अनेक प्रस्तुतियां हुईं ।
The old dependency tree (generated in September 2021) for this sentence
was as follows:
5_हुए  root  VERB
  20_।  punct  PUNCT
  0_कालांतर  obl  NOUN
    1_में  case  ADP
  4_प्रदर्शन  compound  NOUN
    2_इसके  nmod  PRON
    3_हजारों  nummod  NUM
  19_हुईं  conj  VERB
    6_और  cc  CCONJ
    9_rajsthan  obl  PROPN
      8_देश  nmod  NOUN
        7_पड़ोसी  amod  ADJ
      11_बांग्लादेश  conj  PROPN
        10_,  punct  PUNCT
      13_पाकिस्तान  conj  PROPN
        12_और  cc  CCONJ
        14_में  case  ADP
        15_भी  dep  PART
    18_प्रस्तुतियां  compound  NOUN
      16_इसकी  nmod  PRON
      17_अनेक  det  DET
But the new dependency tree (generated on 28 March 2022) is as follows:
5_हुए  root  VERB
  20_।  punct  PUNCT
  0_कालांतर  obl  NOUN
    1_में  case  ADP
  4_प्रदर्शन  compound  NOUN
    2_इसके  nmod  PRON
    3_हजारों  nummod  NUM
  19_हुईं  conj  VERB
    6_और  cc  CCONJ
    9_rajsthan  obl  PROPN
      8_देश  nmod  NOUN
        7_पड़ोसी  amod  ADJ
      11_बांग्लादेश  conj  PROPN
        10_,  punct  PUNCT
      13_पाकिस्तान  conj  PROPN
        12_और  cc  CCONJ
        14_में  case  ADP
        15_भी  dep  PART
    18_प्रस्तुतियां  nsubj  NOUN
      16_इसकी  nmod  PRON
      17_अनेक  det  DET
Notice the word with index 18, i.e. 18_प्रस्तुतियां. In the old
dependency tree its dependency relation is compound, but in the new
dependency tree it is nsubj.
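For anyone who wants to spot this kind of change without reading the trees by eye, here is a minimal sketch of a token-by-token comparison. It assumes each saved tree has been flattened into (index, word, deprel, upos) tuples; that storage format and the name diff_trees are assumptions made for illustration, not anything stanza provides. Only tokens 16 to 19 from the trees above are included to keep it short.

```python
# Tokens 16-19 copied from the old and new trees in this post.
# The tuple layout (index, word, deprel, upos) is an assumed storage format.
old_tree = [
    (16, "इसकी", "nmod", "PRON"),
    (17, "अनेक", "det", "DET"),
    (18, "प्रस्तुतियां", "compound", "NOUN"),
    (19, "हुईं", "conj", "VERB"),
]
new_tree = [
    (16, "इसकी", "nmod", "PRON"),
    (17, "अनेक", "det", "DET"),
    (18, "प्रस्तुतियां", "nsubj", "NOUN"),
    (19, "हुईं", "conj", "VERB"),
]

FIELDS = ("word", "deprel", "upos")


def diff_trees(old, new):
    """Return (index, word, field, old_value, new_value) for every mismatch."""
    old_by_index = {tok[0]: tok for tok in old}
    new_by_index = {tok[0]: tok for tok in new}
    changes = []
    # Only indices present in both trees are compared.
    for index in sorted(old_by_index.keys() & new_by_index.keys()):
        for pos, field in enumerate(FIELDS, start=1):
            before = old_by_index[index][pos]
            after = new_by_index[index][pos]
            if before != after:
                changes.append((index, old_by_index[index][1], field, before, after))
    return changes


changes = diff_trees(old_tree, new_tree)
for change in changes:
    print(change)
```

Running the same comparison over every saved sentence would show whether the differences are limited to a few relations or spread across the corpus.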
I downgraded my stanza library to the version released in August 2021,
yet it still generates the new dependency tree.
What might be the reason behind this behavior?
I use
import stanza
stanza.download('hi')
to download the models. How can I download older models?
If you run download() again, it will download the models appropriate to that
version. Otherwise you might still be using the newer models, which would
explain why the results didn't revert.
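As a rough sketch of that workflow: after installing the older stanza release, re-run the download, parse, and store the package version next to the parse so later runs can be compared against it. The helper names parse_with_provenance and record_to_json and the record layout are hypothetical; the stanza calls are the usual download and Pipeline entry points, and the processor list is an assumption about what the Hindi models need for dependency parsing.

```python
import json


def parse_with_provenance(text, lang="hi"):
    """Re-download models for the installed stanza version, parse the text,
    and return the tokens together with the package version."""
    # Imported lazily so record_to_json below works without stanza installed.
    import stanza

    # Per the reply above, running download() again after changing the
    # package version should fetch the models that go with that version.
    stanza.download(lang)
    nlp = stanza.Pipeline(lang, processors="tokenize,pos,lemma,depparse")
    doc = nlp(text)
    tokens = [
        {
            "id": word.id,
            "text": word.text,
            "head": word.head,
            "deprel": word.deprel,
            "upos": word.upos,
        }
        for sentence in doc.sentences
        for word in sentence.words
    ]
    return {"stanza_version": stanza.__version__, "tokens": tokens}


def record_to_json(record):
    """Serialize a record; ensure_ascii=False keeps the Hindi text readable."""
    return json.dumps(record, ensure_ascii=False, indent=2)
```

Calling parse_with_provenance on the sentence from the original post and saving the output of record_to_json gives a file that records which package version produced each tree. Note that the package version alone may not identify the exact model files, since models are occasionally retrained.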
…On Sat, Apr 9, 2022, 1:09 AM Ritwik Mishra ***@***.***> wrote:
I use
import stanza
stanza.download('hi')
to download the models, how to download older models?