Large FFA produces unexpected mu values #23

Open
jhansen461 opened this issue Mar 13, 2018 · 4 comments

jhansen461 commented Mar 13, 2018

Possibly related to #22

I ran multiple large FFAs. Some of the FFAs consist of large parts of the population while others are much smaller. I noticed that one player who only did a few of the smaller FFAs and performed relatively poorly had the largest mu of all the players while still maintaining a relatively small sigma. Does this appear to be an issue with my setup, this implementation of trueskill, or an issue with trueskill itself?

Here is my setup:
draw_probability = 0, mu = 25, sigma = mu / 3, beta = sigma / 4
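For reference, the setup above could be written out roughly as follows. This is a sketch, assuming the `trueskill` Python package and its `TrueSkill` environment class; the constant names and the import guard are illustrative and not taken from the report. Note that beta = sigma / 4 is half the package default of sigma / 2.

```python
# Parameters as stated in the report.
MU = 25.0
SIGMA = MU / 3              # about 8.333
BETA = SIGMA / 4            # about 2.083; the package default would be SIGMA / 2
DRAW_PROBABILITY = 0.0

# Guarded import so the snippet still runs where the package is not installed.
try:
    import trueskill
except ImportError:
    trueskill = None

env = None
if trueskill is not None:
    # Build an environment with the reported parameters (tau is left at its default).
    env = trueskill.TrueSkill(
        mu=MU,
        sigma=SIGMA,
        beta=BETA,
        draw_probability=DRAW_PROBABILITY,
    )

print(f"mu={MU}, sigma={SIGMA:.3f}, beta={BETA:.3f}, draw_probability={DRAW_PROBABILITY}")
```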

I have bolded the matches where both players competed. Matches are listed in chronological order.

Player 1 (identified externally as the best player):
trueskill.Rating(mu=51.219, sigma=3.449) 1 / 979
trueskill.Rating(mu=40.846, sigma=1.768) 13 / 890
trueskill.Rating(mu=38.448, sigma=1.334) 18 / 727
trueskill.Rating(mu=38.392, sigma=1.132) 3 / 800
trueskill.Rating(mu=38.980, sigma=1.049) 1 / 711
trueskill.Rating(mu=39.408, sigma=0.988) 1 / 578
trueskill.Rating(mu=39.387, sigma=0.911) 2 / 503
trueskill.Rating(mu=39.664, sigma=0.874) 1 / 355
trueskill.Rating(mu=39.789, sigma=0.851) 1 / 687
trueskill.Rating(mu=39.919, sigma=0.852) 2 / 139
trueskill.Rating(mu=39.947, sigma=0.851) 18 / 132
trueskill.Rating(mu=39.382, sigma=0.848) 8 / 128
trueskill.Rating(mu=39.404, sigma=0.851) 2 / 129
trueskill.Rating(mu=40.144, sigma=0.851) 1 / 116
trueskill.Rating(mu=39.502, sigma=0.847) 8 / 115
trueskill.Rating(mu=39.386, sigma=0.849) 1 / 80
trueskill.Rating(mu=39.386, sigma=0.853) 1 / 122

trueskill.Rating(mu=38.502, sigma=0.789) 34 / 1817
trueskill.Rating(mu=37.862, sigma=0.739) 16 / 1629
trueskill.Rating(mu=37.462, sigma=0.698) 8 / 1354
trueskill.Rating(mu=37.562, sigma=0.686) 1 / 1418
trueskill.Rating(mu=37.714, sigma=0.672) 1 / 1304
trueskill.Rating(mu=37.354, sigma=0.642) 10 / 1081
trueskill.Rating(mu=37.001, sigma=0.617) 17 / 975
trueskill.Rating(mu=36.832, sigma=0.596) 4 / 919
trueskill.Rating(mu=36.538, sigma=0.577) 11 / 1237
trueskill.Rating(mu=38.168, sigma=0.579) 9 / 202
trueskill.Rating(mu=37.909, sigma=0.579) 112 / 194
trueskill.Rating(mu=38.314, sigma=0.580) 22 / 182
trueskill.Rating(mu=39.261, sigma=0.580) 10 / 177
trueskill.Rating(mu=38.636, sigma=0.579) 37 / 171
trueskill.Rating(mu=39.591, sigma=0.580) 16 / 166
trueskill.Rating(mu=39.939, sigma=0.582) 2 / 168
trueskill.Rating(mu=39.716, sigma=0.581) 37 / 186

Player 2 (the best player according to trueskill):
trueskill.Rating(mu=41.308, sigma=2.696) 134 / 139
trueskill.Rating(mu=76.557, sigma=1.677) 69 / 132
trueskill.Rating(mu=69.771, sigma=1.357) 115 / 128
trueskill.Rating(mu=72.300, sigma=1.146) 83 / 129
trueskill.Rating(mu=75.554, sigma=1.035) 95 / 116
trueskill.Rating(mu=78.606, sigma=0.942) 87 / 115
trueskill.Rating(mu=87.675, sigma=0.878) 5 / 80
trueskill.Rating(mu=88.466, sigma=0.814) 72 / 122

sublee (Owner) commented Mar 18, 2018

I don't understand your issue. Please explain again in short sentences so that I know:

  1. what "FFA" means
  2. what the expected result is
  3. which result is weird

jhansen461 (Author) commented

  1. FFA = Free for all. Basically a leaderboard rather than a single winner/loser. The ranking is what is at the end of each line.
  2. I am expecting player 1's trueskill (37.973) to be greater than player 2's trueskill (86.024)
  3. Player 2's mu goes up very quickly (with sigma constantly decreasing) despite only doing very well in the second-to-last result. Sigma decreased even after the 5 / 80 result, although player 2 is in the bottom half in every other game. Meanwhile, player 1 often gets 1st place, yet has a trueskill that is substantially lower than player 2's.
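
The two figures quoted in item 2 match the conservative estimate mu - 3*sigma applied to the last rating listed for each player. A quick check in plain Python, as a sketch (the helper name `conservative_score` is illustrative and is not part of the trueskill package):

```python
def conservative_score(mu, sigma, k=3.0):
    """Return mu - k * sigma, the conservative skill estimate."""
    return mu - k * sigma

# Last rating listed for each player in the report.
player1 = conservative_score(39.716, 0.581)
player2 = conservative_score(88.466, 0.814)

print(f"player 1: {player1:.3f}")  # player 1: 37.973
print(f"player 2: {player2:.3f}")  # player 2: 86.024
```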

sublee (Owner) commented Mar 30, 2018

Can I get the full match result set to understand each number? I guess Player 2 has not finished enough games.

bernd-wechner (Contributor) commented

Without the full match history there's no comment to make here. sublee is being very patient with you, jhansen461. The bottom line is that if trueskill thinks Player 2 is the best, there are generally three possible reasons for this:

  1. Player 2 has beaten Player 1 a lot (this effectively rewards Player 2 over Player 1 with mu growth)
  2. Player 2 has beaten more people than Player 1 (this also generally rewards Player 2 with more mu growth than Player 1)
  3. Player 2 has played more games than Player 1 (over time sigma shrinks, so the ranking, which is mu - 3*sigma, goes up simply by virtue of recording results, any results).
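
As a rough check of how much reason 3 could contribute for Player 2, the change in mu - 3*sigma between the first and last ratings listed in the report can be split into a mu part and a sigma part. This is only arithmetic on the posted figures, assuming that list is complete and in chronological order; the variable names are illustrative.

```python
K = 3.0  # multiplier in the ranking mu - K * sigma

# First and last (mu, sigma) pairs listed for Player 2 in the report.
first_mu, first_sigma = 41.308, 2.696
last_mu, last_sigma = 88.466, 0.814

def score(mu, sigma):
    """Ranking value mu - K * sigma."""
    return mu - K * sigma

total_change = score(last_mu, last_sigma) - score(first_mu, first_sigma)
from_mu = last_mu - first_mu                  # rise due to mu growth
from_sigma = K * (first_sigma - last_sigma)   # rise due to sigma shrinking

print(f"total change:      {total_change:.3f}")
print(f"from mu growth:    {from_mu:.3f}")
print(f"from sigma shrink: {from_sigma:.3f}")
```

On these figures, shrinking sigma accounts for about 5.6 of the roughly 52.8-point rise; the rest comes from mu.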

It's not clear from your info in which games Player 1 and Player 2 are playing together and which they aren't, whether the lists are exhaustive (i.e. the whole history of Player 1 and Player 2), or what your external identifier is.

The only way to be sure of what's going on is to step through all the match results and appraise them. But you can do that too: find one that puzzles you, look at the rankings and the updates, and ask questions about that.
