Lichess.org provides an open API that offers information about chess games and players. This information is used to calculate statistics about players and to identify instances where a player might be using a chess engine to generate their moves.
Four types of Kafka producers are created, each responsible for querying the Lichess API and producing its output to Kafka topics:
- Games Producer: Queries https://lichess.org/api/tv/bullet to obtain a list of currently active bullet games. The output of this producer includes information such as the game ID, white and black player IDs, and more.
- Moves Producer: Queries https://lichess.org/api/stream/game/{gameId} to get a stream of moves in a single game. The output contains information about each move played and the index of the move in the game.
- Game Result Producer: Similar to the Moves Producer, this producer queries the same API (https://lichess.org/api/stream/game/{gameId}) but ignores move messages. It only forwards game-related messages and transforms them to obtain information about the game's result.
- Players Producer: Queries https://lichess.org/api/user/{userId} to gather information about players, including their current rank in different game modes and total games played, among other details.
NOTE: Producers may produce duplicate messages for the same event.
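Because duplicates are possible, consumers need a stable key per event to drop repeats. The sketch below is a minimal in-memory illustration of that idea, not the project's implementation: the field names `gameId` and `index` and the helper `move_key` are hypothetical, and a real Kafka Streams application would typically keep the set of seen keys in a state store rather than in process memory.

```python
from typing import Callable, Hashable, Iterable, Iterator


def deduplicate(messages: Iterable[dict],
                key_fn: Callable[[dict], Hashable]) -> Iterator[dict]:
    """Yield each message once, keeping the first occurrence of every key."""
    seen = set()
    for message in messages:
        key = key_fn(message)
        if key in seen:
            continue  # duplicate of an event that was already forwarded
        seen.add(key)
        yield message


def move_key(message: dict) -> tuple:
    # Hypothetical key: a move is identified by its game and its index in that game.
    return (message["gameId"], message["index"])
```

For example, `list(deduplicate(moves, move_key))` keeps only the first copy of each `(gameId, index)` pair while preserving the original order.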
Produced messages adhere to the schemas described at lichess.api for the corresponding API calls. Some commonly used abbreviations are:
- fen: Forsyth-Edwards Notation, the standard notation for describing a position in a chess game
- bc, wc: black counter and white counter, which represent the time left for each player, in seconds
- perf: the user's performance for each game variant (e.g. blitz, bullet)
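As an illustration of how these fields might be read, here is a minimal sketch that parses one line of the game stream. It assumes the stream is newline-delimited JSON, that move messages carry `fen`, `wc`, and `bc`, and that game-level messages can be recognised by an `id` field; the `lm` (last move) field is treated as optional. The `Move` class and `parse_stream_line` function are hypothetical names introduced for this example.

```python
import json
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Move:
    fen: str                  # position after the move, in Forsyth-Edwards Notation
    white_seconds: int        # "wc": time left for White, in seconds
    black_seconds: int        # "bc": time left for Black, in seconds
    last_move: Optional[str]  # "lm", if present in the message


def parse_stream_line(line: str) -> Union[Move, dict, None]:
    """Return a Move for move messages, the raw dict for game-level
    messages, and None for blank keep-alive lines."""
    line = line.strip()
    if not line:
        return None
    message = json.loads(line)
    if "id" in message:
        # Assumption: only game-level messages include the game id.
        return message
    return Move(
        fen=message["fen"],
        white_seconds=message["wc"],
        black_seconds=message["bc"],
        last_move=message.get("lm"),
    )
```

Under these assumptions, a Moves Producer would forward the `Move` values, while a Game Result Producer would keep only the dictionaries.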
Input topics:
- yauza.moves
- yauza.games
- yauza.game-results
- yauza.players

Output topics:
- yauza.move.score
- yauza.game.kpi
- yauza.player.kpi
- yauza.player.suspicious
Topic | Cleanup Policy |
---|---|
yauza.games | delete |
yauza.moves | delete |
yauza.game-results | delete |
yauza.players | compact |
yauza.move.score | delete |
yauza.game.kpi | compact |
yauza.player.kpi | compact |
yauza.player.suspicious | delete |
NOTE: All topics are configured with compression.type: gzip.
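The table and note above can be restated as data. The following sketch only builds the per-topic configuration entries (`cleanup.policy` and `compression.type` are standard Kafka topic configuration keys); it does not create topics, and the names `CLEANUP_POLICY` and `topic_config` are hypothetical.

```python
# Cleanup policy per topic, copied from the table above.
CLEANUP_POLICY = {
    "yauza.games": "delete",
    "yauza.moves": "delete",
    "yauza.game-results": "delete",
    "yauza.players": "compact",
    "yauza.move.score": "delete",
    "yauza.game.kpi": "compact",
    "yauza.player.kpi": "compact",
    "yauza.player.suspicious": "delete",
}


def topic_config(topic: str) -> dict:
    """Return the topic-level settings described in this section."""
    return {
        "cleanup.policy": CLEANUP_POLICY[topic],
        "compression.type": "gzip",  # applies to every topic
    }
```

The compacted topics are the ones that hold the latest state per key (players and KPIs), while event-like topics use the delete policy.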
The Kafka Streams application consumes input topics and performs aggregation and transformation on messages to calculate various KPIs for both players and games.
The topology of the Kafka Streams application consists of the following steps:
- Deduplicate messages from each topic.
- Use the players producer as an "initial import" of the players, considering only the first message for each unique player. The goal is to use Yauza to calculate player statistics over time.
- Define a move's category using the Stockfish engine (Stockfish's UCI) and calculate the score for the player after that move.
- For each game, calculate the following KPIs:
  - Number of Brilliant moves.
  - Number of Excellent moves.
  - Number of Good moves.
  - Number of Inaccuracy moves.
  - Number of Mistake moves.
  - Number of Blunder moves.
  - Player's accuracy.
- For each player, calculate the following KPIs:
  - Win count.
  - Loss count.
  - Draw count.
  - Rated games count.
  - Number of played games.
  - Win/Loss ratio.
  - Number of total correct/incorrect moves.
  - Correct/Incorrect moves ratio.
  - Mean player's accuracy: Number of correct moves / Total number of moves.
  - Macro accuracy: Average of accuracies for each game that the player played.
  - Median accuracy: 50th percentile of the accuracies of every game played.
  - Standard deviation of accuracy.
- Detect potential cheaters: A player is flagged if they have three or more games whose accuracy satisfies the following condition, where STD is the standard deviation of the player's accuracy:
  gameAccuracy >= meanPlayerAccuracy + 2 * STD
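
The accuracy KPIs and the detection rule above can be sketched in a few lines. This is an illustration under stated assumptions, not the application's code: each game is given as a hypothetical `(correct_moves, total_moves)` pair, the mapping from move categories to "correct" is left to the caller, and the population standard deviation is used because the text does not say which variant applies.

```python
from statistics import mean, median, pstdev
from typing import List, Tuple

Game = Tuple[int, int]  # (correct_moves, total_moves) for one player in one game


def accuracy_kpis(games: List[Game]) -> dict:
    """Compute the accuracy KPIs listed above from per-game move counts."""
    played = [(c, t) for c, t in games if t > 0]  # skip games without moves
    if not played:
        return {"mean": 0.0, "macro": 0.0, "median": 0.0, "std": 0.0, "per_game": []}
    per_game = [c / t for c, t in played]
    return {
        # Mean accuracy: all correct moves over all moves.
        "mean": sum(c for c, _ in played) / sum(t for _, t in played),
        # Macro accuracy: average of the per-game accuracies.
        "macro": mean(per_game),
        "median": median(per_game),
        # Assumption: population standard deviation of per-game accuracy.
        "std": pstdev(per_game),
        "per_game": per_game,
    }


def is_suspicious(games: List[Game], min_games: int = 3) -> bool:
    """Apply gameAccuracy >= meanPlayerAccuracy + 2 * STD to every game."""
    kpis = accuracy_kpis(games)
    threshold = kpis["mean"] + 2 * kpis["std"]
    flagged = sum(1 for accuracy in kpis["per_game"] if accuracy >= threshold)
    return bool(kpis["per_game"]) and flagged >= min_games
```

Note that with this rule a player needs a fairly long history before three games can exceed the threshold, since only a small share of games can sit two standard deviations above the mean.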
The producer and consumer applications, along with the Kafka cluster and Schema Registry, are deployed using Docker containers. The containers are orchestrated using the docker-compose tool. Configuration details can be found in the ./infrastructure/docker-compose.yml file. All files needed to set up and deploy the cluster are in the infrastructure folder.