Implement TrueSkill for more than two teams #11

asyncth · 2024-06-06T13:18:08Z

Disclaimer: this is my attempt to blindly port the trueskill Python package to Rust for this library; I barely understand what's going on in there.

The idea here is to have a separate implementation for multi-team scenarios, but still keep the 1v1 and two-team shortcuts that are currently implemented on the master branch, since they're likely to be faster. For now, however, I have removed those shortcuts so that I can run the tests without modifying them. The multi-team implementation is supposed to be gated behind the disabled-by-default trueskill-full feature because it introduces dependencies, while the 1v1 and two-team implementations are supposed to be available regardless of which features are enabled.
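
A minimal sketch of how that gating could look on the Rust side, assuming the feature is declared under the name `trueskill-full` in Cargo.toml and is not part of the default feature set. The function and module names are hypothetical placeholders, not this crate's API, and the bodies do no real rating math.

```rust
// Sketch only: `trueskill-full` is the feature name from this PR description;
// every function and module name below is a hypothetical placeholder.

/// Two-team shortcut: always compiled, needs no extra dependencies.
/// The body is a stand-in that returns its inputs unchanged.
pub fn rate_two_teams_placeholder(team_one: f64, team_two: f64) -> (f64, f64) {
    (team_one, team_two)
}

/// Multi-team path: only compiled when the crate is built with
/// `--features trueskill-full`.
#[cfg(feature = "trueskill-full")]
pub mod multi_team {
    /// Stand-in for the factor-graph based update.
    pub fn rate_placeholder(teams: &[f64]) -> Vec<f64> {
        teams.to_vec()
    }
}

/// Reports whether the multi-team code was compiled in.
pub fn multi_team_enabled() -> bool {
    cfg!(feature = "trueskill-full")
}
```

With this layout a default build never compiles `multi_team`, so the extra dependencies can be marked `optional = true` and tied to the feature.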

The match_quality and expected_score functions are not yet implemented. Tests seem to be failing because the rating update functions return results that differ, though only slightly, both from the values expected in the tests and from the values returned by the Python trueskill and TypeScript ts-trueskill packages (not sure what the issue is?).

Performance is likely horrible, and I'm not very happy about the Rc<RefCell<Variable>> usage either, which could probably be avoided if I came up with some other way to implement these factor graphs.
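
One common alternative to shared `Rc<RefCell<...>>` nodes, sketched here under assumptions, is an index-based arena: the graph owns all variables in a `Vec`, and factors hold plain indices. The names `Variable`, `VarId`, `Graph` and `PriorFactor` are illustrative only and are not the types used in this PR; the sketch also leaves out per-factor message bookkeeping.

```rust
// Sketch only: illustrative types, not the ones from this PR.

/// A Gaussian stored in natural parameters:
/// pi = 1 / sigma^2 (precision), tau = mu / sigma^2.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Variable {
    pub pi: f64,
    pub tau: f64,
}

/// Handle into the arena; cheap to copy and store inside factors.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VarId(usize);

/// Owns every variable; factors refer to variables by `VarId`.
#[derive(Debug, Default)]
pub struct Graph {
    vars: Vec<Variable>,
}

impl Graph {
    pub fn add_var(&mut self, var: Variable) -> VarId {
        self.vars.push(var);
        VarId(self.vars.len() - 1)
    }

    pub fn var(&self, id: VarId) -> Variable {
        self.vars[id.0]
    }

    /// Multiplies a Gaussian message into a variable. In natural
    /// parameters a product of Gaussians is a sum of pi and of tau.
    pub fn multiply_in(&mut self, id: VarId, message: Variable) {
        let v = &mut self.vars[id.0];
        v.pi += message.pi;
        v.tau += message.tau;
    }
}

/// A factor that shares a variable with other factors by holding its id.
pub struct PriorFactor {
    pub target: VarId,
    pub message: Variable,
}

impl PriorFactor {
    pub fn send(&self, graph: &mut Graph) {
        graph.multiply_in(self.target, self.message);
    }
}
```

Because several factors can hold the same `VarId`, they can all update one variable through `&mut Graph` without reference counting or runtime borrow checks.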

In general, this implementation feels rather inappropriate for this library right now, since the crate is advertised as lightweight and blazingly fast.

Closes #8.

asyncth · 2024-06-06T13:20:21Z

Also, the dependencies really could be removed too; I just didn't want to bother implementing the matrix stuff.

atomflunder · 2024-06-06T15:44:50Z

Hi @asyncth ,

thank you very much for your contribution! I can't take a look right now but will do so later today.

As for the match_quality and matrix stuff, I have already implemented those in a separate branch; only the multi-team function itself was missing. See PR #9 and Issue #8.

Maybe we could merge both of these branches?
Anyways thank you very much again for your contribution and your time 😊

asyncth · 2024-06-06T15:59:55Z

Thanks. I would appreciate it if you could give some kind of guess as to why the rate functions seem to be around ±0.000001 off from the trueskill Python package; specifically, I'm wondering whether I did something wrong or whether it's some kind of rounding issue. It's a pretty weird issue, considering that the values returned from ts-trueskill match the values returned from the Python trueskill package exactly, according to your calculator web app at least.

> Maybe we could merge both of these branches?

Sure, probably should refactor that factor graph mess first though.

atomflunder · 2024-06-06T18:49:12Z

> Thanks, I would appreciate if you could give some kind of a guess on why rate functions seem to be around ±0.000001 off from the trueskill Python package, specifically I'm wondering if I did something wrong or if that's some kind of rounding issue.

I do not have any clue at the moment. I read through your code and it looks (to me at least) to be the same as the original. I did have a similar issue earlier in this crate's lifespan, where the Glicko-2 calculations were slightly off and I put it down to a rounding error. It turned out I had in fact made a mistake, and after fixing it the calculations were on point 😅.
That being said, I do think the difference is very small. I would not rule out a legit floating point rounding issue here.
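
For telling rounding noise apart from a real discrepancy, a tolerance check along these lines can help. This is a minimal sketch; the helper names are made up for illustration and are not part of this crate or its tests.

```rust
/// Returns true when `a` and `b` differ by at most `tol`.
pub fn approx_eq(a: f64, b: f64, tol: f64) -> bool {
    (a - b).abs() <= tol
}

/// Largest absolute difference between two equally long slices,
/// e.g. ratings from this crate versus ratings from a reference package.
pub fn max_abs_diff(actual: &[f64], expected: &[f64]) -> f64 {
    assert_eq!(actual.len(), expected.len(), "slices must have equal length");
    actual
        .iter()
        .zip(expected)
        .map(|(a, e)| (a - e).abs())
        .fold(0.0_f64, f64::max)
}
```

An f64 carries roughly 15 to 17 significant decimal digits, so printing `max_abs_diff` for a handful of test cases shows how far a gap of about 0.000001 sits above plain representation error.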

> Sure, probably should refactor that factor graph mess first though.

Sounds good. I do think it would be easier to integrate the multi team stuff into the other branch, rather than the other way around, but feel free to do whatever you like. I'm busy until the weekend, but after that I'll do my best to help.

> In general this implementation feels rather inappropriate for this library right now as it is claimed that this crate is supposed to be lightweight and blazingly fast.

On that note, I have thought about the idea of this crate having feature flags for the specific rating algorithm to reduce bloat. Compilers will still just ignore the dead code so it's not a speed-up, but it will reduce compile times etc.
Stuff like:

[dependencies]
skillratings = {version = "0.26", features = ["serde", "trueskill", "glicko-2", "all"]}

But I don't know if it's very user-friendly. Very open to suggestions here. I might open a separate issue for that discussion.

asyncth · 2024-06-06T19:19:06Z

Do you happen to understand factor graph message passing as a concept in general? I feel like it would be easier to refactor factor_graph if I did understand it, but I don't, so I'll take a look at the implementation in Skills repo tomorrow as well.

asyncth · 2024-06-06T19:21:47Z

Btw, the reason I'm doing this now and not back when I did the previous commits is that somehow it never occurred to me that I could just copy the other library without understanding anything. I don't know why I didn't think of that immediately... I wonder if we need to credit the trueskill package authors?

atomflunder · 2024-06-07T05:04:28Z

> Do you happen to understand factor graph message passing as a concept in general?

Not really to be honest.

> I wonder if we need to credit the trueskill package authors?

That's a good point, they have a weird license. We could add a credit for them in the doc comment for the trueskill module.

atomflunder · 2024-06-08T12:38:00Z

Hi again @asyncth ,

I have played around with the code some more and found out why the calculations were off: the normal distribution library differs slightly from the math implementation in the TrueSkill Python package, which uses some shortcuts and approximations. After switching the math functions to the replicated ones (which I had already implemented earlier), the calculations line up (I would put the remaining difference down to rounding error):

I have pushed these changes to the other branch, I hope that is okay 🙂
Need to clean up some stuff still and add tests. Also I think there will be merge conflicts due to the branch being pretty outdated.
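
To illustrate the kind of shortcut that can cause such a gap: the package's exact code is not reproduced in this thread, so the sketch below assumes the approximation in question is the widely used polynomial fit for erfc popularised by Numerical Recipes, whose fractional error is below roughly 1.2e-7. The function names are illustrative, not this crate's API.

```rust
/// Complementary error function via a fitted polynomial approximation
/// (fractional error below about 1.2e-7). Assumed, not confirmed, to be
/// the style of shortcut discussed above.
pub fn erfc_approx(x: f64) -> f64 {
    let z = x.abs();
    let t = 1.0 / (1.0 + z / 2.0);
    // Coefficients of the fitted polynomial, lowest power of t first.
    const C: [f64; 10] = [
        -1.26551223, 1.00002368, 0.37409196, 0.09678418, -0.18628806,
        0.27886807, -1.13520398, 1.48851587, -0.82215223, 0.17087277,
    ];
    // Horner's rule: evaluates C[0] + C[1]*t + ... + C[9]*t^9.
    let poly = C.iter().rev().fold(0.0_f64, |acc, c| acc * t + *c);
    let r = t * (-z * z + poly).exp();
    if x < 0.0 { 2.0 - r } else { r }
}

/// Normal CDF built on the approximate erfc.
pub fn cdf_approx(x: f64, mu: f64, sigma: f64) -> f64 {
    0.5 * erfc_approx(-(x - mu) / (sigma * std::f64::consts::SQRT_2))
}
```

A statistics library that evaluates erfc to near machine precision will disagree with this version around the seventh significant digit, which is the same order of magnitude as the mismatch reported earlier in the thread.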

asyncth · 2024-06-08T13:37:11Z

Any suggestions on how to get rid of Rc<RefCell<Variable>> usage? Unfortunately I still haven't taken a look at Skills repo like I said I would :(

atomflunder · 2024-06-08T13:39:31Z

> Any suggestions on how to get rid of Rc<RefCell<Variable>> usage?

Not at the moment, no. I think the best course of action going forward is to merge the other branch into main, and then refactor later. We should probably also implement the partial play weighting, but I don't think that is too difficult.

asyncth · 2024-06-08T13:41:41Z

Thank you for writing the doc comment and making a proper test, so what do you want me to do next?

atomflunder · 2024-06-08T13:44:56Z

> Thank you for writing the doc comment and making a proper test, so what do you want me to do next?

Not sure 😅

I am kinda confused myself with the two branches right now, and the merge conflicts look like a headache to me. If you want to, you could copy the trueskill folder of the other branch into here, delete the dependencies, and then we just merge your branch and close the other.

Thoughts?

asyncth · 2024-06-08T13:50:24Z

Yeah, that might work, I'm not totally sure how to copy the commits and keep your authorship though

atomflunder · 2024-06-08T13:51:07Z

Since you did most of the work, I think it's only fair you get the credit

asyncth · 2024-06-08T13:55:01Z

Why though, I really appreciate you writing a doc comment, I hate doing it personally, especially since my English tends to be very inarticulate in non-casual styles

atomflunder · 2024-06-08T14:03:58Z

You could add the co-authored-by tag if you wanted 😊

Co-authored-by: asyncth <[email protected]>

asyncth · 2024-06-08T14:57:05Z

That "Implement trueskill_multi_team" commit got merged somewhat weirdly: it didn't have the MultiTeamOutcome import despite it being in mod.rs in your branch. Not sure what the issue was, hopefully nothing else got affected by this.

asyncth · 2024-06-08T15:05:07Z

@atomflunder What's the issue with the Coverage check? Does rustfmt need to be run?

atomflunder · 2024-06-08T15:06:09Z

> That Implement trueskill_multi_team commit got merged somewhat weirdly, it didn't have the MultiTeamOutcome import despite it being in mod.rs in your branch, not sure what the issue was, hopefully nothing else got affected by this

I'm looking at the complete diff from this PR, and it looks fine to me. Also I am not sure what the Coverage Check is complaining about lol.

I will do some of the smaller, annoying stuff later, like adding tests for some of the edge cases, fixing some clippy lints, more docs etc. but all of that is not really critical, just as a heads up. 👍
After that, I'll do a version bump and a new release.

If you want to tackle the performance improvements you talked about, feel free to open another PR in the future. In preparation for that, I will also probably add a benchmark, so we have a baseline to compare the performance to.

And I just wanna thank you again for the contribution, it's greatly appreciated. 🙏

atomflunder · 2024-06-08T15:09:46Z

The coverage check ran fine when I merged it, maybe it has something to do with the repository secret API key, I genuinely don't know.

atomflunder and others added 5 commits June 8, 2024 19:10

Implement expected_score_multi_team for TrueSkill

4d1cc48

Implement match_quality_multi_team

d288803

Implement some more tests to satisfy codecov

5bb682c

Implement trueskill_multi_team

2a5c5e5

Co-authored-by: asyncth <[email protected]>

Add test and documentation

0f02755

asyncth force-pushed the add-full-trueskill branch from e8ca9b3 to 0f02755 Compare June 8, 2024 14:50

asyncth marked this pull request as ready for review June 8, 2024 14:58

atomflunder merged commit 4adbb98 into atomflunder:master Jun 8, 2024
3 of 4 checks passed

atomflunder mentioned this pull request Jun 8, 2024

Support multiple teams for TrueSkill algorithm (WIP) #9

Closed
