This is a data analysis exercise I decided to do after hearing the "nothing to hide" theory about Dota 2 players' data. I'm playing devil's advocate and trying to prove that theory using publicly available data from www.dotabuff.com (sorry for the heavy scraping, guys!).
"Nothing to hide": anonymous players negatively impact the performance of their team. Meaning: if a team has more anonymous players, its odds of winning are lower.
- Anonymous players are the majority of players
- Overall they do not affect a match result
**But I need more data to test the advanced theory: a big disparity in anonymous players between the teams will negatively impact the match outcome.**
- Straightforward global test
- How large a difference in anonymous players is needed to make an impact
- Whether a high number of anonymous players in a game makes the outcome harder to predict
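
A minimal sketch of what the straightforward global test could look like, using only the Python standard library. The record layout `(radiant_anon, dire_anon, radiant_win)`, the function name `global_test`, and the sample rows are all assumptions made up for illustration; they are not the scraped data. The idea: among matches where the two teams have different numbers of anonymous players, count how often the team with more of them wins, and compare that to the 50% you would expect if anonymity did not matter. The normal approximation used here is only meaningful with a large number of matches.

```python
import math

# Hypothetical records: (anonymous players on Radiant,
#                        anonymous players on Dire,
#                        did Radiant win?)
# Made-up rows for illustration only, not scraped data.
matches = [
    (3, 1, False),
    (2, 2, True),
    (0, 4, True),
    (5, 2, False),
    (1, 3, False),
    (4, 4, False),
    (2, 0, True),
    (1, 2, True),
]


def global_test(matches):
    """Normal-approximation binomial test against a 50% win rate.

    Returns (n, wins, z, p_value), where `wins` counts matches won by the
    team that had MORE anonymous players and `n` counts matches where the
    two teams' anonymous counts differ.
    """
    n = 0
    wins = 0
    for radiant_anon, dire_anon, radiant_win in matches:
        if radiant_anon == dire_anon:
            continue  # no disparity, so the match says nothing here
        n += 1
        radiant_has_more = radiant_anon > dire_anon
        # The team with more anonymous players won if it is Radiant and
        # Radiant won, or if it is Dire and Radiant lost.
        if radiant_has_more == radiant_win:
            wins += 1
    if n == 0:
        raise ValueError("no match with differing anonymous counts")
    z = (wins - n / 2) / math.sqrt(n / 4)       # null hypothesis: p = 0.5
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided p-value
    return n, wins, z, p_value


n, wins, z, p_value = global_test(matches)
print(f"{wins}/{n} wins for the more-anonymous team, z={z:.2f}, p={p_value:.3f}")
```

A negative `z` with a small `p_value` would support the "nothing to hide" theory; a `p_value` close to 1 would mean the data cannot tell the two kinds of teams apart.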
- Overall, anonymous players have a very low impact
- There will be a sweet spot where they have a significant impact, e.g. when the 3 anonymous players in a match are all on one team
- Prove that anonymous players are not intrinsically bad: many valid reasons to be anonymous other than hiding your poor performance
- Prove that an uneven distribution of anonymous players can skew the results of the game, thus proving the need to balance the number of anonymous players via matchmaking (maybe a recommendation for Valve)
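
To look for that sweet spot, one option is to group matches by the gap in anonymous players between the two teams and compare win rates per gap. The sketch below assumes the same hypothetical record layout `(radiant_anon, dire_anon, radiant_win)`; the function name `win_rate_by_disparity` and the sample rows are invented for illustration and are not results.

```python
from collections import defaultdict

# Hypothetical records: (anonymous players on Radiant,
#                        anonymous players on Dire,
#                        did Radiant win?)
# Made-up rows for illustration only, not scraped data.
sample_matches = [
    (3, 0, False),
    (3, 0, False),
    (0, 3, True),
    (2, 1, True),
    (1, 2, True),
    (2, 2, False),
]


def win_rate_by_disparity(matches):
    """Map each gap (own anonymous count minus opponent's) to (games, win rate)."""
    games = defaultdict(int)
    wins = defaultdict(int)
    for radiant_anon, dire_anon, radiant_win in matches:
        gap = radiant_anon - dire_anon
        # Record the match once from each team's point of view.
        games[gap] += 1
        wins[gap] += int(radiant_win)
        games[-gap] += 1
        wins[-gap] += int(not radiant_win)
    return {gap: (games[gap], wins[gap] / games[gap]) for gap in sorted(games)}


rates = win_rate_by_disparity(sample_matches)
for gap, (count, rate) in rates.items():
    print(f"gap {gap:+d}: {count} team-games, win rate {rate:.0%}")
```

Because every match is counted from both sides, the table is symmetric and the win rate at gap 0 is always 50%. A sweet spot would show up as a gap where the win rate drops clearly below 50%, provided that gap has enough games behind it.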