BAR round_update method is wrong? #26
Comments
That would be a really useful contribution which Paul and I would be happy to support you on, if you'd like to put us in the loop! Regarding BAR, I regret not documenting my reasoning better. The overwrite was on purpose, though I really should have commented out line 63. I kept this first formula because it has a natural mathematical interpretation (likely related to the original paper, though I can't recall exactly). However, in practice it was hard to get BAR to work well with large numbers of players. The second formula (line 67) is a solution based on the same reasoning as Elo-MMR: namely, that the number of opponents is so large that we get a near-perfect measurement of each player's performance (but of course, not a perfect measurement of their underlying skill level). This seems to work better.
Recently, I took on a hobby project for an Elo rating system. Eventually, the codebase got massive. A Discord community needed an Elo system more robust than Microsoft TrueSkill. Numerous problems arose from using TrueSkill: good players can farm rating from noobs (mainly because their deviation and rating are higher than their true rating), and other problems. Some ways of circumventing this issue are "placement lobbies" (allow noobs to gain a proper rating before affecting other players' ratings), adding priors from the previous season (order the players by relative rating and generate a normal distribution of these players), using game statistics such as kills/assists/deaths/wins/losses, etc. Every season (3 months) the system is adjusted with new methods and parameter tuning.
But these added features have their own problems. If a pro player plays noob lobbies, he may not see his rating change for a while, until the noobs gain a proper placement. There is a balance between the enjoyment of using an Elo system and adding constraints to prevent exploitation. Every feature added came with its own exploit (e.g., enforcing a balancer increased lobby dodging, only allowing games longer than 15 minutes created leavers, etc.). Generally, adding more constraints makes the Elo system less enjoyable.
It is an interesting and unusual scenario in that the host of the game lobby determines the balance (there is no matchmaking system). Last season the Elo system enforced balanced lobbies. Given the set of team combinations, only the intersection of three conditions is allowed: the groupings of two players ordered by rating cannot be on the same team (i.e., the top two players cannot be on the same team, the next grouping of two players cannot be on the same team, etc.); the average Elo rating of both teams must be under a specific threshold; and, given a list of game combinations sorted by the difference of the Elo averages of both teams, only the top k results are allowed.
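For anyone trying to picture the three balancing conditions above, here is a minimal Python sketch. It is not the code the Discord system runs. The function name `allowed_splits` and the parameters `max_avg_diff` and `k` are hypothetical, an even number of players is assumed, and the threshold condition is interpreted (an assumption on my part) as a cap on the difference between the two team averages.

```python
from itertools import combinations
from statistics import mean


def allowed_splits(ratings, max_avg_diff, k):
    """Return allowed (team_a, team_b) splits as tuples of player names.

    ratings: dict mapping player name -> rating (even player count assumed).
    max_avg_diff: hypothetical cap on |avg(team_a) - avg(team_b)|.
    k: how many of the most balanced splits are eligible at all.
    """
    players = sorted(ratings, key=ratings.get, reverse=True)
    half = len(players) // 2
    # Consecutive pairs in rating order: (1st, 2nd), (3rd, 4th), ...
    pairs = [(players[i], players[i + 1]) for i in range(0, len(players) - 1, 2)]

    # Enumerate each split exactly once by fixing the top player on team A.
    first, rest = players[0], players[1:]
    splits = []
    for others in combinations(rest, half - 1):
        team_a = (first,) + others
        team_b = tuple(p for p in players if p not in team_a)
        diff = abs(mean(ratings[p] for p in team_a) - mean(ratings[p] for p in team_b))
        splits.append((diff, team_a, team_b))

    # Condition 3: only the k splits with the smallest average difference.
    splits.sort(key=lambda s: s[0])
    top_k = splits[:k]

    allowed = []
    for diff, team_a, team_b in top_k:
        on_a = set(team_a)
        # Condition 1: the two members of each rating pair are on opposite teams.
        pairs_split = all((x in on_a) != (y in on_a) for x, y in pairs)
        # Condition 2 (as interpreted here): team averages are close enough.
        if pairs_split and diff < max_avg_diff:
            allowed.append((team_a, team_b))
    return allowed


ratings = {"a": 2000, "b": 1800, "c": 1500, "d": 1300}
print(allowed_splits(ratings, max_avg_diff=100, k=2))
# [(('a', 'd'), ('b', 'c'))]
```

Brute-force enumeration is fine for small lobbies; for larger ones the number of splits grows quickly and a smarter search would be needed.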
However, even with this enforcement, pro players can still farm noobs. I rewrote your entire codebase in Python to support PyTorch for GPU acceleration. There are also some nice libraries that support convex optimization, such as SciPy: https://docs.scipy.org/doc/scipy/tutorial/optimize.html#. Hopefully, by next season (Dec. 20th), I can get the systems and parameters set. I will test some methods and let you know the outcome, and perhaps find some ideas for a more robust Elo system for teams.
Currently, I am trying to convert all the rating systems to support teams.
I have been trying to understand the BAR `round_update` according to Algorithm 1 in the paper at https://jmlr.csail.mit.edu/papers/volume12/weng11a/weng11a.pdf. The variable `info` is instantiated on line 51. Line 63 is accumulating, but then on line 67 it is set again, replacing the accumulated value.
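To make the question concrete, here is the shape of the pattern as I read it, reduced to a hypothetical Python sketch. This is not the repository's code: `overwrite_pattern`, `per_opponent_terms` and `closed_form_value` are placeholder names, and the actual formulas on lines 63 and 67 are deliberately not reproduced.

```python
def overwrite_pattern(per_opponent_terms, closed_form_value):
    """Illustrate an accumulate-then-overwrite sequence on one variable."""
    info = 0.0                    # analogue of the initialisation on line 51
    for term in per_opponent_terms:
        info += term              # analogue of the accumulation on line 63
    info = closed_form_value      # analogue of the assignment on line 67
    return info                   # the accumulated sum never reaches the caller
```

Whatever the loop accumulates, the returned value depends only on the final assignment, which is why the accumulation looked like dead code to me.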