Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement TrueSkill for more than two teams #11

Merged
merged 5 commits into from
Jun 8, 2024

Conversation

asyncth
Copy link
Contributor

@asyncth asyncth commented Jun 6, 2024

Disclaimer: this is my attempt to blindly rewrite the trueskill Python package in Rust into this library, I barely understand what's going on there.

The idea here is to have a separate implementation for multi-team scenarios, but still have the 1v1 and 2 teams shortcuts that are currently implemented in master branch since they're likely to be faster. I, however, temporarily removed those shortcuts just so that I can run tests without modifying them. The multi-team implementation is supposed to be gated behind the disabled-by-default trueskill-full feature since this implementation introduces dependencies, while the 1v1 and 2 teams implementations are supposed to be available regardless of what features are enabled.

match_quality and expected_score functions are not yet implemented. Tests seem to be failing due to rating update functions returning results that are different from both values that are expected in the tests and from values returned by Python trueskill and TypeScript ts-trueskill packages, but only slightly (not sure what the issue is?).

Performance is likely horrible, not very happy about Rc<RefCell<Variable>> usage either, which probably could be avoided if I came up with some other way to implement these factor graphs.

In general this implementation feels rather inappropriate for this library right now as it is claimed that this crate is supposed to be lightweight and blazingly fast.

Closes #8.

@asyncth
Copy link
Contributor Author

asyncth commented Jun 6, 2024

Also dependencies really could be removed too, I just didn't want to bother implementing matrix stuff.

@atomflunder
Copy link
Owner

atomflunder commented Jun 6, 2024

Hi @asyncth ,

thank you very much for your contribution! I can't take a look right now but will do so later today.

As for the match_quality and Matrix stuff, I have already implemented those in a separate branch, only the Multi team function itself was missing. See PR #9 and Issue #8.

Maybe we could merge both of these branches?
Anyways thank you very much again for your contribution and your time 😊

@asyncth
Copy link
Contributor Author

asyncth commented Jun 6, 2024

Thanks, I would appreciate if you could give some kind of a guess on why rate functions seem to be around ±0.000001 off from the trueskill Python package, specifically I'm wondering if I did something wrong or if that's some kind of rounding issue. Pretty weird issue, considering that values returned from ts-trueskill match values returned from python trueskill package exactly, according to your calculator web app at least.

Maybe we could merge both of these branches?

Sure, probably should refactor that factor graph mess first though.

@atomflunder
Copy link
Owner

atomflunder commented Jun 6, 2024

Thanks, I would appreciate if you could give some kind of a guess on why rate functions seem to be around ±0.000001 off from the trueskill Python package, specifically I'm wondering if I did something wrong or if that's some kind of rounding issue.

I do not have any clue at the moment, I read through your code and it looks (to me at least) to be the same as the original. I did have a kinda similar issue earlier in this crate's lifespan where the Glicko-2 calculations were slightly off and I put it down to being a rounding error - turns out I did in fact make a mistake and after fixing it the calculations were on point 😅.
That being said, I do think the difference is very small. I would not rule out a legit floating point rounding issue here.

Sure, probably should refactor that factor graph mess first though.

Sounds good. I do think it would be easier to integrate the multi team stuff into the other branch, rather than the other way around, but feel free to do whatever you like. I'm busy until the weekend, but after that I'll do my best to help.

In general this implementation feels rather inappropriate for this library right now as it is claimed that this crate is supposed to be lightweight and blazingly fast.

On that note, I have thought about the idea of this crate having feature flags for the specific rating algorithm to reduce bloat. Compilers will still just ignore the dead code so it's not a speed-up, but it will reduce compile times etc.
Stuff like:

[dependencies]
skillratings = {version = "0.26", features = ["serde", "trueskill", "glicko-2", "all"]}

But I don't know if its very user-friendly. Very open to suggestions here. I might open a separate issue for that discussion.

@asyncth
Copy link
Contributor Author

asyncth commented Jun 6, 2024

Do you happen to understand factor graph message passing as a concept in general? I feel like it would be easier to refactor factor_graph if I did understand it, but I don't, so I'll take a look at the implementation in Skills repo tomorrow as well.

@asyncth
Copy link
Contributor Author

asyncth commented Jun 6, 2024

Btw the reason I'm doing this now and not back when I did previous commits is that somehow it never came to my mind that I could just copy the other library without understanding anything, I don't know why didn't I think of that immediately... I wonder if we need to credit the trueskill package authors?

@atomflunder
Copy link
Owner

atomflunder commented Jun 7, 2024

Do you happen to understand factor graph message passing as a concept in general?

Not really to be honest.

I wonder if we need to credit the trueskill package authors?

That's a good point, they have a weird license. We could add a credit for them in the doc comment for the trueskill module.

@atomflunder
Copy link
Owner

atomflunder commented Jun 8, 2024

Hi again @asyncth ,

I have played around with the code some more and I found out why the calculations were off: The Normal Distribution Library differs slightly from the math implementations of the TrueSkill Python package, the Python package uses some shortcuts and approximations. After changing the math functions to the replicated ones (I have already implemented those earlier), the calculations line up (The rest diff I would say is a rounding error):

Screenshot from 2024-06-08 14-27-19
image

I have pushed these changes to the other branch, I hope that is okay 🙂
Need to clean up some stuff still and add tests. Also I think there will be merge conflicts due to the branch being pretty outdated.

@asyncth
Copy link
Contributor Author

asyncth commented Jun 8, 2024

Any suggestions on how to get rid of Rc<RefCell<Variable>> usage? Unfortunately I still haven't taken a look at Skills repo like I said I would :(

@atomflunder
Copy link
Owner

Any suggestions on how to get rid of Rc<RefCell> usage?

Not at the moment no. I think the best course of action going forward is to merge the other branch into main, and then refactor later. We also probably should implement the partial play weighting, but that is not too difficult I think.

@asyncth
Copy link
Contributor Author

asyncth commented Jun 8, 2024

Thank you for writing the doc comment and making a proper test, so what do you want me to do next?

@atomflunder
Copy link
Owner

atomflunder commented Jun 8, 2024

Thank you for writing the doc comment and making a proper test, so what do you want me to do next?

Not sure 😅

I am kinda confused myself with the two branches right now, and the merge conflicts look like a headache to me. If you want to, you could copy the trueskill folder of the other branch into here, delete the dependencies, and then we just merge your branch and close the other.

Thoughts?

@asyncth
Copy link
Contributor Author

asyncth commented Jun 8, 2024

Yeah, that might work, I'm not totally sure how to copy the commits and keep your authorship though

@atomflunder
Copy link
Owner

Since you did most of the work, I think it's only fair you get the credit

@asyncth
Copy link
Contributor Author

asyncth commented Jun 8, 2024

Why though, I really appreciate you writing a doc comment, I hate doing it personally, especially since my English tends to be very inarticulate in non-casual styles

@atomflunder
Copy link
Owner

You could add the co-authored-by tag if you wanted 😊

@asyncth
Copy link
Contributor Author

asyncth commented Jun 8, 2024

That Implement trueskill_multi_team commit got merged somewhat weirdly, it didn't have the MultiTeamOutcome import despite it being in mod.rs in your branch, not sure what the issue was, hopefully nothing else got affected by this

@asyncth asyncth marked this pull request as ready for review June 8, 2024 14:58
@asyncth
Copy link
Contributor Author

asyncth commented Jun 8, 2024

@atomflunder What's the issue with the Coverage check? Does rustfmt need to be run?

@atomflunder
Copy link
Owner

That Implement trueskill_multi_team commit got merged somewhat weirdly, it didn't have the MultiTeamOutcome import despite it being in mod.rs in your branch, not sure what the issue was, hopefully nothing else got affected by this

I'm looking at the complete diff from this PR, and it looks fine to me. Also I am not sure what the Coverage Check is complaining about lol.

I will do some of the smaller, annoying stuff later, like adding tests for some of the edge cases, fixing some clippy lints, more docs etc. but all of that is not really critical, just as a heads up. 👍
After that, I'll do a version bump and a new release.

If you want to tackle the performance improvements you talked about, feel free to open another PR in the future. In preparation for that, I will also probably add a benchmark, so we have a baseline to compare the performance to.

And I just wanna thank you again for the contribution, it's greatly appreciated. 🙏

@atomflunder atomflunder merged commit 4adbb98 into atomflunder:master Jun 8, 2024
3 of 4 checks passed
@atomflunder
Copy link
Owner

The coverage check ran fine when I merged it, maybe it has something to do with the repository secret API key, I genuinely don't know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support multiple teams in TrueSkill algorithms
2 participants