Large FFA produces unexpected mu values #23

jhansen461 · 2018-03-13T21:11:33Z

Possibly related to #22

I ran multiple large FFAs. Some of the FFAs consist of large parts of the population while others are much smaller. I noticed that one player who only did a few of the smaller FFAs and performed relatively poorly had the largest mu of all the players while still maintaining a relatively small sigma. Does this appear to be an issue with my setup, this implementation of trueskill, or an issue with trueskill itself?

Here is my setup:
draw_probability = 0, mu = 25, sigma = mu / 3, beta = sigma / 4
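
For reference, the setup above works out to the numbers below. This is only a sketch: the variable names are illustrative, and the comparison assumes the `trueskill` package's default beta is sigma / 2, so the beta used here would be half of that default.

```python
# Parameters as described in the setup line above.
mu = 25.0
sigma = mu / 3           # about 8.333
beta = sigma / 4         # about 2.083
draw_probability = 0.0

# Assumption: the package default for beta is sigma / 2 (about 4.167).
default_beta = sigma / 2

print(f"sigma={sigma:.3f} beta={beta:.3f} default_beta={default_beta:.3f}")

# These values would presumably be passed to the environment, e.g.
# trueskill.TrueSkill(mu=mu, sigma=sigma, beta=beta,
#                     draw_probability=draw_probability)
```

A smaller beta tells the model that a given skill gap decides a match more reliably, so it may be worth checking whether that is what was intended.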

I have bolded the matches where both players competed. Matches are listed in chronological order.

Player 1 (identified externally as the best player):
trueskill.Rating(mu=51.219, sigma=3.449) 1 / 979
trueskill.Rating(mu=40.846, sigma=1.768) 13 / 890
trueskill.Rating(mu=38.448, sigma=1.334) 18 / 727
trueskill.Rating(mu=38.392, sigma=1.132) 3 / 800
trueskill.Rating(mu=38.980, sigma=1.049) 1 / 711
trueskill.Rating(mu=39.408, sigma=0.988) 1 / 578
trueskill.Rating(mu=39.387, sigma=0.911) 2 / 503
trueskill.Rating(mu=39.664, sigma=0.874) 1 / 355
trueskill.Rating(mu=39.789, sigma=0.851) 1 / 687
trueskill.Rating(mu=39.919, sigma=0.852) 2 / 139
trueskill.Rating(mu=39.947, sigma=0.851) 18 / 132
trueskill.Rating(mu=39.382, sigma=0.848) 8 / 128
trueskill.Rating(mu=39.404, sigma=0.851) 2 / 129
trueskill.Rating(mu=40.144, sigma=0.851) 1 / 116
trueskill.Rating(mu=39.502, sigma=0.847) 8 / 115
trueskill.Rating(mu=39.386, sigma=0.849) 1 / 80
trueskill.Rating(mu=39.386, sigma=0.853) 1 / 122
trueskill.Rating(mu=38.502, sigma=0.789) 34 / 1817
trueskill.Rating(mu=37.862, sigma=0.739) 16 / 1629
trueskill.Rating(mu=37.462, sigma=0.698) 8 / 1354
trueskill.Rating(mu=37.562, sigma=0.686) 1 / 1418
trueskill.Rating(mu=37.714, sigma=0.672) 1 / 1304
trueskill.Rating(mu=37.354, sigma=0.642) 10 / 1081
trueskill.Rating(mu=37.001, sigma=0.617) 17 / 975
trueskill.Rating(mu=36.832, sigma=0.596) 4 / 919
trueskill.Rating(mu=36.538, sigma=0.577) 11 / 1237
trueskill.Rating(mu=38.168, sigma=0.579) 9 / 202
trueskill.Rating(mu=37.909, sigma=0.579) 112 / 194
trueskill.Rating(mu=38.314, sigma=0.580) 22 / 182
trueskill.Rating(mu=39.261, sigma=0.580) 10 / 177
trueskill.Rating(mu=38.636, sigma=0.579) 37 / 171
trueskill.Rating(mu=39.591, sigma=0.580) 16 / 166
trueskill.Rating(mu=39.939, sigma=0.582) 2 / 168
trueskill.Rating(mu=39.716, sigma=0.581) 37 / 186

Player 2 (the best player according to trueskill):
trueskill.Rating(mu=41.308, sigma=2.696) 134 / 139
trueskill.Rating(mu=76.557, sigma=1.677) 69 / 132
trueskill.Rating(mu=69.771, sigma=1.357) 115 / 128
trueskill.Rating(mu=72.300, sigma=1.146) 83 / 129
trueskill.Rating(mu=75.554, sigma=1.035) 95 / 116
trueskill.Rating(mu=78.606, sigma=0.942) 87 / 115
trueskill.Rating(mu=87.675, sigma=0.878) 5 / 80
trueskill.Rating(mu=88.466, sigma=0.814) 72 / 122

sublee · 2018-03-18T04:31:08Z

I don't understand your issue. Please explain again with short sentences to let me know:

FFAs meaning
what's the expected result
which result is weird

jhansen461 · 2018-03-25T20:10:26Z

FFA = Free for all. Basically a leaderboard rather than a single winner/loser. The ranking is what is at the end of each line.
I am expecting Player 1's TrueSkill rating (mu - 3*sigma = 37.973) to be greater than Player 2's (86.024).
Player 2's mu goes up very quickly (with sigma constantly decreasing) despite only doing very well in the 2nd to last result. Getting the 5 / 80 even decreased sigma even though player 2 is always in the bottom half in every other game. Meanwhile, player 1 is often getting 1st place in multiple games, yet has a trueskill that is substantially lower than player 2.
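
The two figures quoted above (37.973 and 86.024) match mu - 3*sigma applied to the last rating listed for each player. A minimal sketch of that calculation, where `conservative_rating` is a hypothetical helper name and not part of the library:

```python
def conservative_rating(mu, sigma, k=3.0):
    """Conservative skill estimate: mu minus k standard deviations."""
    return mu - k * sigma

# Final ratings copied from the lists in the first post.
player1 = conservative_rating(39.716, 0.581)
player2 = conservative_rating(88.466, 0.814)

print(round(player1, 3), round(player2, 3))  # 37.973 86.024
```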

sublee · 2018-03-30T05:17:43Z

Can I get the full match result set to understand each number? I guess Player 2 has not finished enough games.

bernd-wechner · 2018-04-24T06:53:05Z

Without the full match history there's no comment to make here. sublee is being very patient with you, jhansen461. The bottom line is that if TrueSkill thinks Player 2 is the best, there are generally three possible reasons:

Player 2 has beaten Player 1 a lot (this effectively rewards Player 2 over Player 1 with mu growth)
Player 2 has beaten more people than Player 1 (this also effectively rewards Player 2 with more mu growth than Player 1 generally)
Player 2 has played more games than Player 1 (over time sigma shrinks, so the ranking, which is mu - 3*sigma, goes up simply by virtue of recording results, any results).

It's not at all clear from your info which games Players 1 and 2 played together and which they didn't, whether the lists are exhaustive (i.e. the whole history of Player 1 and Player 2), or what your external identifier is.

The only way to be sure of what's going on is to step through all the match results and appraise them. But you can do that too. Find one that puzzles you, look at the rankings and the updates, and ask questions about that.
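
One way to pick the puzzling match is to look at the change in mu from one line to the next. The sketch below uses only Player 2's numbers from the first post and assumes each line shows the rating after the match on that line, starting from the initial mu of 25. The function and field names are illustrative.

```python
INITIAL_MU = 25.0

# (mu after the match, finishing place, field size), from Player 2's list.
player2_history = [
    (41.308, 134, 139),
    (76.557, 69, 132),
    (69.771, 115, 128),
    (72.300, 83, 129),
    (75.554, 95, 116),
    (78.606, 87, 115),
    (87.675, 5, 80),
    (88.466, 72, 122),
]

def mu_steps(history, initial_mu=INITIAL_MU):
    """Return the change in mu for each match, in chronological order."""
    steps = []
    previous = initial_mu
    for mu, place, field in history:
        steps.append({"place": place, "field": field, "delta_mu": mu - previous})
        previous = mu
    return steps

steps = mu_steps(player2_history)

# The single largest jump in mu.
largest = max(steps, key=lambda s: s["delta_mu"])

# Matches where mu rose despite a bottom-half finish.
bottom_half_gains = [
    s for s in steps if s["place"] / s["field"] > 0.5 and s["delta_mu"] > 0
]

print(largest)
print(len(bottom_half_gains), "bottom-half finishes with rising mu")
```

A rising mu after a bottom-half finish is not wrong by itself, since the update depends on the ratings of everyone else in the match, but the largest jump (after the 69 / 132 result) looks like a reasonable place to start asking questions.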
