Why dependency tree parser generates different parse trees of the same sentence? #990

ritwikmishra · 2022-04-04T15:10:27Z

I am working on a project that involves dependency parsing of Hindi sentences and then performing "further operations" on the parses. I recently tried to replicate some of my previous results and was surprised to get different results for the same sentences. My code for the "further operations" is rule-based and unchanged, yet the results differ.

Fortunately, I saved some of the dependency trees from my previous experiment, and I noticed that stanza now generates slightly different dependency trees. This usually happens with long sentences. For example, consider the following sentence:

कालांतर में इसके हजारों प्रदर्शन हुए और पड़ोसी देश rajsthan , बांग्लादेश और पाकिस्तान में भी इसकी अनेक प्रस्‍तुतियां हुईं ।

The old dependency tree (generated in September 2021) for this sentence was as follows:

                                                            5_हुए                                                                                               
                                                              root                                                                                              
                                                              VERB                                                                                              
   ____________________________________________________________|___________________________                                                                      
  |         |                    |                                                     19_हुईं                                                                  
  |         |                    |                                                        conj                                                                  
  |         |                    |                                                        VERB                                                                  
  |         |                    |                     ____________________________________|________________________________________________                     
  |         |                    |                    |                               9_rajsthan                                            |                   
  |         |                    |                    |                                                                                     |                   
  |         |                    |                    |                                   obl                                               |                   
  |         |                    |                    |                                  PROPN                                              |                   
  |         |                    |                    |         ___________________________|_____________                                   |                    
  |       0_कालांतर             4_प्रदर्शन                 |     8_देश         11_बांग्लादेश                  13_पाकिस्तान                           18_प्रस्‍तुतियां            
  |        obl                                        |       nmod                                                                                              
  |        NOUN               compound                |       NOUN          conj                        conj                             compound               
  |         |                   NOUN                  |        |           PROPN                       PROPN                               NOUN                 
  |         |          __________|___________         |        |             |              _____________|___________          _____________|______________      
20_।       1_में      2_इसके                  3_हजारों    6_और   7_पड़ोसी        10_,         12_और           14_में       15_भी   16_इसकी                      17_अनेक  
punct      case      nmod                  nummod     cc      amod         punct           cc           case        dep      nmod                         det   
PUNCT      ADP       PRON                   NUM     CCONJ     ADJ          PUNCT         CCONJ          ADP         PART     PRON                         DET

But the new dependency tree (generated on 28 March 2022) is as follows:

                                                              5_हुए                                                                                             
                                                              root                                                                                             
                                                              VERB                                                                                             
   ____________________________________________________________|___________________________                                                                    
  |         |                    |                                                       19_हुईं                                                                 
  |         |                    |                                                        conj                                                                 
  |         |                    |                                                        VERB                                                                 
  |         |                    |                     ____________________________________|________________________________________________                   
  |         |                    |                    |                               9_rajsthan                                            |                  
  |         |                    |                    |                                                                                     |                  
  |         |                    |                    |                                   obl                                               |                  
  |         |                    |                    |                                  PROPN                                              |                  
  |         |                    |                    |         ___________________________|_____________                                   |                  
  |       0_कालांतर            4_प्रदर्शन                  |       8_देश         11_बांग्लादेश                13_पाकिस्तान                          18_प्रस्‍तुतियां              
  |        obl                                        |       nmod                                                                                             
  |        NOUN               compound                |       NOUN          conj                        conj                              nsubj                
  |         |                   NOUN                  |        |           PROPN                       PROPN                               NOUN                
  |         |          __________|___________         |        |             |              _____________|___________          _____________|______________    
20_।        1_में     2_इसके                3_हजारों      6_और   7_पड़ोसी        10_,            12_और         14_में      15_भी    16_इसकी                      17_अनेक 
punct      case      nmod                  nummod     cc      amod         punct           cc           case        dep      nmod                         det  
PUNCT      ADP       PRON                   NUM     CCONJ     ADJ          PUNCT         CCONJ          ADP         PART     PRON                         DET

Notice the word with index=18, i.e. 18_प्रस्‍तुतियां: in the old dependency tree its dependency relation is compound, but in the new dependency tree it is nsubj.
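
A difference like this is easy to miss when comparing two trees by eye. Below is a minimal sketch of a helper that lists the tokens whose relation changed between two saved parses. The `diff_deprels` name and the dict-based tree format are hypothetical choices for illustration, not part of stanza, and the data is a hand-copied excerpt (tokens 16 to 19) of the two trees above.

```python
from typing import Dict, List, Tuple

# Each tree maps a token index to (word, dependency relation).
# This format is an assumption made for the sketch, not a stanza data structure.
Token = Tuple[str, str]


def diff_deprels(old: Dict[int, Token], new: Dict[int, Token]) -> List[Tuple[int, str, str, str]]:
    """Return (index, word, old_relation, new_relation) for every token whose relation changed."""
    changes = []
    # Only compare indices present in both trees.
    for idx in sorted(old.keys() & new.keys()):
        word, old_rel = old[idx]
        _, new_rel = new[idx]
        if old_rel != new_rel:
            changes.append((idx, word, old_rel, new_rel))
    return changes


# Excerpt of the September 2021 tree.
old_tree = {
    16: ("इसकी", "nmod"),
    17: ("अनेक", "det"),
    18: ("प्रस्‍तुतियां", "compound"),
    19: ("हुईं", "conj"),
}

# Excerpt of the March 2022 tree.
new_tree = {
    16: ("इसकी", "nmod"),
    17: ("अनेक", "det"),
    18: ("प्रस्‍तुतियां", "nsubj"),
    19: ("हुईं", "conj"),
}

changes = diff_deprels(old_tree, new_tree)
for idx, word, old_rel, new_rel in changes:
    print(f"{idx}_{word}: {old_rel} -> {new_rel}")
```

On this excerpt the only reported change is token 18, from compound to nsubj.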

I downgraded my stanza library to the version released in August 2021, yet it still generates the new dependency tree.

What might be the reason behind this behavior?

AngledLuffa · 2022-04-04T19:35:17Z

The models are occasionally retrained when the data is updated. Did you redownload the older models when you downgraded the package?

ritwikmishra · 2022-04-09T08:08:58Z

I use

import stanza
stanza.download('hi')

to download the models. How do I download older models?

AngledLuffa · 2022-04-09T16:44:11Z

If you run download() again after downgrading, it will download the models appropriate to that version. Otherwise you might still be using the newer models, which would explain why the results didn't revert.
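
One way to check whether the model files on disk actually changed after a re-download is to fingerprint them before and after. The following is only a sketch under stated assumptions: it assumes the models are `.pt` files under the default resources directory (`~/stanza_resources`, unless configured otherwise), and the helper names `fingerprint_models` and `changed_models` are hypothetical, not stanza functions.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def fingerprint_models(model_dir: Path) -> Dict[str, str]:
    """Map each model file (path relative to model_dir) to its SHA-256 digest."""
    digests = {}
    # Assumption: model files use the .pt extension.
    for path in sorted(model_dir.rglob("*.pt")):
        h = hashlib.sha256()
        with path.open("rb") as f:
            # Read in 1 MiB chunks so large model files are not loaded at once.
            for chunk in iter(lambda: f.read(1 << 20), b""):
                h.update(chunk)
        digests[str(path.relative_to(model_dir))] = h.hexdigest()
    return digests


def changed_models(before: Dict[str, str], after: Dict[str, str]) -> List[str]:
    """List files that were added, removed, or whose contents differ."""
    names = before.keys() | after.keys()
    return sorted(name for name in names if before.get(name) != after.get(name))


# Example usage (not executed here; the directory is an assumed default):
#   before = fingerprint_models(Path.home() / "stanza_resources" / "hi")
#   ... downgrade the package, then run stanza.download('hi') again ...
#   after = fingerprint_models(Path.home() / "stanza_resources" / "hi")
#   print(changed_models(before, after))
```

If `changed_models` returns an empty list after the re-download, the models on disk are the same ones as before. Saving the fingerprints next to experiment outputs also records which models produced a given set of trees.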

ritwikmishra added the question label Apr 4, 2022

ritwikmishra closed this as completed Apr 11, 2022
