Why does the dependency parser generate different parse trees for the same sentence? #990

Closed
ritwikmishra opened this issue Apr 4, 2022 · 3 comments

@ritwikmishra

I was working on a project that involves dependency parsing of Hindi sentences and then performing "further operations" on the resulting parses. I recently tried to replicate some of my previous results, but I was surprised to see different results for the same sentences I had used before. My code for the "further operations" is rule-based and unchanged, yet the results were different.

Fortunately, I had saved some of the dependency trees from my previous experiment, and I noticed that stanza now generates slightly different dependency trees. This usually happens with long sentences. For example, consider the following sentence:

कालांतर में इसके हजारों प्रदर्शन हुए और पड़ोसी देश rajsthan , बांग्लादेश और पाकिस्तान में भी इसकी अनेक प्रस्‍तुतियां हुईं ।

The old dependency tree (generated in September 2021) for this sentence was as follows:

                                                            5_हुए                                                                                               
                                                              root                                                                                              
                                                              VERB                                                                                              
   ____________________________________________________________|___________________________                                                                      
  |         |                    |                                                     19_हुईं                                                                  
  |         |                    |                                                        conj                                                                  
  |         |                    |                                                        VERB                                                                  
  |         |                    |                     ____________________________________|________________________________________________                     
  |         |                    |                    |                               9_rajsthan                                            |                   
  |         |                    |                    |                                                                                     |                   
  |         |                    |                    |                                   obl                                               |                   
  |         |                    |                    |                                  PROPN                                              |                   
  |         |                    |                    |         ___________________________|_____________                                   |                    
  |       0_कालांतर             4_प्रदर्शन                 |     8_देश         11_बांग्लादेश                  13_पाकिस्तान                           18_प्रस्‍तुतियां            
  |        obl                                        |       nmod                                                                                              
  |        NOUN               compound                |       NOUN          conj                        conj                             compound               
  |         |                   NOUN                  |        |           PROPN                       PROPN                               NOUN                 
  |         |          __________|___________         |        |             |              _____________|___________          _____________|______________      
20_।       1_में      2_इसके                  3_हजारों    6_और   7_पड़ोसी        10_,         12_और           14_में       15_भी   16_इसकी                      17_अनेक  
punct      case      nmod                  nummod     cc      amod         punct           cc           case        dep      nmod                         det   
PUNCT      ADP       PRON                   NUM     CCONJ     ADJ          PUNCT         CCONJ          ADP         PART     PRON                         DET   

But the new dependency tree (generated on 28 March 2022) is as follows:

                                                              5_हुए                                                                                             
                                                              root                                                                                             
                                                              VERB                                                                                             
   ____________________________________________________________|___________________________                                                                    
  |         |                    |                                                       19_हुईं                                                                 
  |         |                    |                                                        conj                                                                 
  |         |                    |                                                        VERB                                                                 
  |         |                    |                     ____________________________________|________________________________________________                   
  |         |                    |                    |                               9_rajsthan                                            |                  
  |         |                    |                    |                                                                                     |                  
  |         |                    |                    |                                   obl                                               |                  
  |         |                    |                    |                                  PROPN                                              |                  
  |         |                    |                    |         ___________________________|_____________                                   |                  
  |       0_कालांतर            4_प्रदर्शन                  |       8_देश         11_बांग्लादेश                13_पाकिस्तान                          18_प्रस्‍तुतियां              
  |        obl                                        |       nmod                                                                                             
  |        NOUN               compound                |       NOUN          conj                        conj                              nsubj                
  |         |                   NOUN                  |        |           PROPN                       PROPN                               NOUN                
  |         |          __________|___________         |        |             |              _____________|___________          _____________|______________    
20_।        1_में     2_इसके                3_हजारों      6_और   7_पड़ोसी        10_,            12_और         14_में      15_भी    16_इसकी                      17_अनेक 
punct      case      nmod                  nummod     cc      amod         punct           cc           case        dep      nmod                         det  
PUNCT      ADP       PRON                   NUM     CCONJ     ADJ          PUNCT         CCONJ          ADP         PART     PRON                         DET  

Notice the word with index 18, i.e. 18_प्रस्‍तुतियां: in the old dependency tree its dependency relation is compound, but in the new dependency tree it is nsubj.
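Spotting such a difference by eye in two wide trees is error-prone. Below is a minimal sketch of a token-by-token comparison of two saved parses. The `Token` record and the `diff_parses` helper are hypothetical names introduced only for illustration, not part of stanza, and the sketch assumes each saved parse can be reduced to (index, text, head, deprel) entries using the same 0-based indices as the trees above. Only the last three tokens before the final punctuation are shown.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    """Hypothetical record for one word of a saved parse (not a stanza class)."""
    index: int   # 0-based position, matching the labels in the trees above
    text: str
    head: int    # index of the governing word
    deprel: str  # dependency relation to the head


def diff_parses(old: List[Token], new: List[Token]) -> List[Tuple]:
    """Return (index, text, field, old_value, new_value) for every change."""
    if len(old) != len(new):
        raise ValueError("parses have different numbers of tokens")
    changes = []
    for o, n in zip(old, new):
        if (o.index, o.text) != (n.index, n.text):
            raise ValueError(f"token mismatch at index {o.index}")
        # Compare only the fields that describe the tree structure.
        for field in ("head", "deprel"):
            before, after = getattr(o, field), getattr(n, field)
            if before != after:
                changes.append((o.index, o.text, field, before, after))
    return changes


# Fragment of the two trees above (tokens 17 to 19 only).
word_18 = "प्रस्‍तुतियां"
old = [
    Token(17, "अनेक", 18, "det"),
    Token(18, word_18, 19, "compound"),
    Token(19, "हुईं", 5, "conj"),
]
new = [
    Token(17, "अनेक", 18, "det"),
    Token(18, word_18, 19, "nsubj"),
    Token(19, "हुईं", 5, "conj"),
]

for change in diff_parses(old, new):
    print(change)
```

On this fragment the only reported change is the deprel of token 18, from compound to nsubj. The head of that token is the same in both trees.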

I downgraded my stanza library to the version released in Aug 2021, yet it still generates the new dependency tree.
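One thing that can be checked here is whether the model files on disk are the same ones used in the earlier experiment, since `stanza.download` stores them separately from the installed package. The sketch below fingerprints every file under a model directory so that two runs can be compared. The `fingerprint_models` helper is a hypothetical name, and the path `~/stanza_resources/hi` is an assumption based on the default download location; adjust it if a different model directory was used.

```python
import hashlib
from pathlib import Path
from typing import Dict


def fingerprint_models(model_dir: str) -> Dict[str, str]:
    """Map each file under model_dir (relative path) to its SHA-256 digest."""
    root = Path(model_dir).expanduser()
    if not root.is_dir():
        return {}
    digests = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        sha = hashlib.sha256()
        with path.open("rb") as handle:
            # Read in 1 MiB chunks so large model files are not loaded at once.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                sha.update(chunk)
        digests[path.relative_to(root).as_posix()] = sha.hexdigest()
    return digests


if __name__ == "__main__":
    # Assumption: the Hindi models are in the default location.
    for name, digest in fingerprint_models("~/stanza_resources/hi").items():
        print(digest[:16], name)
```

Saving this listing next to the saved dependency trees makes it possible to tell later whether a changed parse coincides with changed model files.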

What might be the reason behind this behavior?

@AngledLuffa
Collaborator

AngledLuffa commented Apr 4, 2022 via email

@ritwikmishra
Author

I use

import stanza
stanza.download('hi')

to download the models. How can I download older models?

@AngledLuffa
Collaborator

AngledLuffa commented Apr 9, 2022 via email
