
Reproducing results from the java implementation #19

Closed · azehe opened this issue Feb 15, 2021 · 6 comments
Labels: bug (Something isn't working), question (Further information is requested)

azehe commented Feb 15, 2021

Thanks for the package!

  • Problem
    I'm currently trying to switch from the "official" implementation (https://gamma.greyc.fr/) to this one, but I'm having trouble getting the same results.

  • Reproducing
    As an example, I tried the Alex, Paul, Suzan data from the java web app. Converted to the CSV format that `Continuum.from_csv` expects, it looks as follows:

Alex,1,2,12
Alex,2,13,19
Alex,5,24,30
Alex,6,32,36
Alex,6,36,44
Alex,7,49,60
Paul,1,2,9
Paul,3,11,17
Paul,5,19,25
Paul,6,32,44
Suzan,1,2,9
Suzan,4,11,17
Suzan,5,21,27
Suzan,6,32,36
Suzan,6,36,40
Suzan,6,42,46
Paul,7,48,58
Suzan,7,48,58

In the web app, this gives me γ = 0.451034437799.

Using the same data in your implementation gives a different value for gamma:

from pygamma_agreement import Continuum, CombinedCategoricalDissimilarity

continuum = Continuum.from_csv("test_data/aps.csv")
dissim = CombinedCategoricalDissimilarity(list(continuum.categories), alpha=1, beta=1)
gamma_results = continuum.compute_gamma(dissim, precision_level=0.01)

print(f"The gamma for that annotation is {gamma_results.gamma}")

The gamma for that annotation is 0.5044930080375523

I've also tried my own data, where the results sometimes differ even more (negative vs. positive).

I'm using alpha=1 and beta=1, which, as I understand from the Gamma paper, seem to be the default values. However, I'm not sure whether these are the values used in the java implementation and haven't managed to find out.

Is there any parameter I'm missing or setting to a wrong value?

  • Environment
    I'm using pygamma_agreement==0.1.6 with python 3.8.7 on Fedora 33.

PS: Curiously, I just noticed that I'm also getting different results from the java web app and the java offline app, which gives me gamma=0.55

Rachine (Collaborator) commented Feb 15, 2021

Hi, thank you for the issue!

The software from Mathet et al. (https://gamma.greyc.fr/) does not use the parameters mentioned in the paper. We found the same discrepancy as you when we did the re-implementation.
The software uses alpha=1 and beta=3; the original authors confirmed these alpha and beta values to us.
Besides, there might be slight differences of +-0.01 with their implementation. We have one main suspect:
the shuffling/sampling methodology they mention in their paper is hard to replicate without access to their code, so we made the most rational choices.
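For intuition on why beta matters, here is a pure-Python sketch of the unit-to-unit dissimilarity described in Mathet et al. (2015). The positional term is ((|Δstart| + |Δend|) / (length_u + length_v))²; the simple weighted sum α·d_pos + β·d_cat is an assumption of this sketch, not necessarily pygamma-agreement's exact internals:

```python
def positional_dissimilarity(u, v):
    """Positional dissimilarity from Mathet et al. (2015):
    ((|delta start| + |delta end|) / (length_u + length_v)) ** 2."""
    (su, eu), (sv, ev) = u, v
    num = abs(su - sv) + abs(eu - ev)
    den = (eu - su) + (ev - sv)
    return (num / den) ** 2

def combined_dissimilarity(u, cat_u, v, cat_v, alpha=1.0, beta=1.0):
    # Weighted sum of the positional and categorical terms; the exact
    # weighting used internally by pygamma-agreement may differ.
    d_cat = 0.0 if cat_u == cat_v else 1.0
    return alpha * positional_dissimilarity(u, v) + beta * d_cat

# Two units from the sample data: Alex (2, 12, cat 1) vs Paul (2, 9, cat 1)
print(combined_dissimilarity((2, 12), 1, (2, 9), 1, alpha=1, beta=1))
# Same positions but different categories: with beta=3 the category
# mismatch dominates the cost of pairing these two units
print(combined_dissimilarity((2, 12), 1, (2, 9), 3, alpha=1, beta=3))
```

With beta=3, a category mismatch is three times as costly relative to positional error, which changes which alignments are optimal and hence the final gamma.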

azehe (Author) commented Feb 15, 2021

Hi,
thanks for the quick reply! With these parameters, I'm getting gamma=0.405, which is still a good bit off from the java result.
With my own data, the difference is much larger: gamma=-0.07 (-0.13 <= gamma <= -0.02) for the java version and gamma=0.122 for the python version. I can upload a sample of my data if that helps.
Any suggestions on how I could debug this further? Is that in the range that you would expect from the shuffling method?

Rachine (Collaborator) commented Feb 15, 2021

Yes, a sample of your data might help us understand how you get this discrepancy indeed!

> With my own data, the difference is much larger: gamma=-0.07 (-0.13 <= gamma <= -0.02) for the java version and gamma=0.122 for the python version. I can upload a sample of my data if that helps.

If I understood correctly, you obtained the negative gamma value -0.07 with the java version?

> Any suggestions on how I could debug this further? Is that in the range that you would expect from the shuffling method?

What range are you referring to?

Rachine self-assigned this Feb 15, 2021
azehe (Author) commented Feb 15, 2021

> Yes, a sample of your data might help us understand how you get this discrepancy indeed!

This is the data I'm currently using: continuum.csv.gz
Note that I'm just experimenting with this data and comparing a simple baseline method to manual annotations, so the agreement is expected to be low.

> If I understood correctly, you obtained the negative gamma value -0.07 with the java version?

Exactly. It also gives a range (probably a kind of confidence interval), which is the -0.13 <= gamma <= -0.02 that I reported.

> What range are you referring to?

You said that there could be slight differences to the original implementation; that's what I meant by "range".

hadware added the bug (Something isn't working) and question (Further information is requested) labels Feb 15, 2021
Rachine (Collaborator) commented Feb 17, 2021

Hi,

After investigating, we think your low gamma value might come from the many splits in your pred timeline (a segment t1-tn transformed into t1-t2, t2-t3, ..., t(n-1)-tn). This type of error is heavily penalized by gamma, since it looks for an alignment.
Besides, the gamma agreement was designed as an agreement between two annotators, not exactly as a metric for ML systems.
Therefore, we think the chance estimate should not depend on the pred timeline, so that all systems can be compared. We implemented this as the ground_truth_annotators option, but it is not documented:
https://github.com/bootphon/pygamma-agreement/blob/master/pygamma_agreement/continuum.py#L173

As mentioned in issue #16, we think the use of gamma as a metric remains an open research question.
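Since gamma is chance-corrected (gamma = 1 − observed disorder / expected disorder, per Mathet et al. 2015), how the expected (chance) disorder is estimated directly shifts the value, and can even flip its sign. A toy sketch with made-up disorder values (not numbers from this issue's data):

```python
def gamma(observed_disorder, expected_disorder):
    """Chance-corrected agreement: gamma = 1 - observed / expected
    (Mathet et al., 2015)."""
    return 1 - observed_disorder / expected_disorder

# Same observed disorder, but two different estimates of the chance
# (expected) disorder, e.g. depending on which annotators' timelines
# are resampled when simulating chance:
print(gamma(0.5, 0.4))  # chance estimate below observed -> negative gamma
print(gamma(0.5, 0.8))  # chance estimate above observed -> positive gamma
```

This is why restricting the chance computation to the ground-truth annotators (rather than including a pred timeline full of splits) can move gamma from negative to positive on the same data.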

ghost commented Aug 25, 2021

Since v0.2.0 fixes the differences between the java implementation and ours, and those differences are explained in the new "Issues" section of the documentation, this issue is outdated.

ghost closed this as completed Aug 25, 2021
This issue was closed.