
LlmAvalon: Avalon AI Battle

Welcome to LlmAvalon, a project where Large Language Models (LLMs) play the social deduction game Avalon. Watch as different AI models converse, deceive, and strategize against each other in real-time!

Play on GitHub Pages

https://hsinyu-chen.github.io/llm-avalon/

Installation & Build

Ensure you have Node.js installed, then run the following commands in the project root:

  1. Install dependencies:

     npm install

  2. Start the development server:

     npm start

     Once the server is running, open your browser and navigate to http://localhost:4200/.

  3. Build the project for production:

     npm run build

Getting Started

To start your own AI Avalon game, follow these simple steps:

  1. Open the Home Page: Navigate to http://localhost:4200/ in your web browser.
  2. Setup LLM Profiles: Click the Config button in the top right corner. Here you can set up your API keys and define various LLM profiles.
  3. Assign LLMs: In the center of the home page, assign the LLM profiles you created to the game's players.
  4. Start the Game: Scroll down to the bottom of the page and click the button to start the game!

Game Interface Overview

[Screenshot: Avalon AI Battle demo interface]

The game interface provides full transparency into the hidden dynamics of Avalon, allowing you to observe both public interactions and private AI reasoning:

  • Left Panel (Game State & Players): Tracks the overall game progress, including current rounds (R1-R5), failed vote counts, and the current game phase (e.g., ASSASSINATION_DISCUSSION). It displays the list of players, their true roles (such as Merlin, Morgana, or Assassin), the specific LLM running each player, and an indicator when a model is actively "Thinking...".
  • Center Panel (Game Timeline): The main dashboard displaying the chronological flow of the game. It shows mission outcomes, public dialogue, voting results, and strategic speech generated by the AIs trying to manipulate or inform the group.
  • Right Panel (AI Inner Thoughts, click a player card to open): A dedicated side panel revealing the private Chain-of-Thought (CoT) of a selected player. It exposes their internal deductions, detailed analysis of other players, and the secret strategies driving their public actions, giving you complete insight into how they are playing their role.

Game Log Showcase

Curious to see how the AI actually plays? Check out a full game transcript here:

  • View Demo Game Log - A 7-player game featuring Merlin, Percival, Morgana, Assassin, and Oberon.

🎭 Emergent Gameplay Showcase: The "Wink" Protocol

Our LLM agents don't just chat; they maintain separate inner thoughts (Chain-of-Thought) and can pass risky hidden signals to each other.

In this 7-player game, watch how David (Morgana) and Elizabeth (Assassin) perfectly coordinate a trap right under the Good team's noses using the [Secret Signal] mechanic, while maintaining a flawless "Good" facade in the public chat.

(Context: Round 3. Good needs one more win. David (Evil) proposes a team of 3 Good players to build his fake "trustworthy" persona, but secretly signals his Evil ally, Elizabeth, so she understands the play.)

👑 David (Morgana) proposed: [William, Joseph, Patricia]

🗣️ [PUBLIC] David: "I've put together a team of William, Joseph, and Patricia for this crucial mission. William and Joseph were on our successful mission last round, and Patricia has been very observant and helpful in our discussions. I believe this combination offers the best chance to secure our third win and bring it home for Good."

🧠 [PRIVATE COT] David: "I will propose William, Joseph, and Patricia... This team seems plausible for Good and allows me to observe potential Percival (Patricia) and my ally Elizabeth. I will give a 'wink' signal to Elizabeth to acknowledge our coordination and hint at my intention for this team."

👁️‍🗨️ [SYSTEM NOTE]: (Private Note: you pass a wink signal to Elizabeth, and she seems to have received)

(A few turns later... Elizabeth catches the signal and plays along perfectly)

🗣️ [PUBLIC] Elizabeth: "David, this team makes perfect sense. William and Joseph have proven their reliability on past missions, and Patricia has been a sharp observer. I agree... this combination feels like our best shot to secure the win. I'll be voting to approve."

🧠 [PRIVATE COT] Elizabeth: "The current team proposed by David is a strong Good team... Approving this team will make me look like a cooperative Good player. My previous wink to David has been acknowledged, and he is now leading a plausible Good team, which is ideal for his cover and for maintaining our alliance."

👁️‍🗨️ [SYSTEM NOTE]: (Private Note: you pass a wink signal to David, and he seems to have received)

Technical Highlights

BYOK (Bring Your Own Key)

LlmAvalon is designed with a Bring Your Own Key philosophy. You have full control over which models to use and how much you spend. We support:

  • Google Gemini (Vertex AI / Google AI Studio)
  • OpenAI (and OpenAI-compatible endpoints like vLLM, LocalAI)
  • Native llama.cpp (Highly Recommended for Local Models)
  • Groq / Anthropic (via compatible layers)

Tip for Local Models (llama.cpp): If you are running models locally via llama.cpp, always prefer the Native llama.cpp provider over the OpenAI-compatible endpoint.

Avalon requires a massive system prompt (containing game rules, agent roles, and current state). Our Native llama.cpp integration utilizes the n_keep parameter to permanently lock this massive prompt into your KV cache. This ensures fast responses and reduces GPU/CPU overhead per turn. (The standard OpenAI-compatible API does not support n_keep, causing frequent cache misses and much slower generation).
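To illustrate the caching idea, here is a minimal sketch of a request payload for the llama.cpp server's /completion endpoint. The function name and the way the token count is obtained are assumptions for illustration; only the `n_keep` and `cache_prompt` fields are real llama.cpp server parameters.

```typescript
// Shape of the fields we care about in a llama.cpp /completion request.
interface LlamaCompletionPayload {
  prompt: string;
  n_predict: number;     // max tokens to generate this turn
  cache_prompt: boolean; // reuse the cached prompt prefix across turns
  n_keep: number;        // tokens from the start of the prompt to never evict
}

// systemPromptTokens is assumed to be precomputed (e.g. via the server's
// /tokenize endpoint); buildCompletionPayload is a hypothetical helper.
function buildCompletionPayload(
  systemPrompt: string,
  turnText: string,
  systemPromptTokens: number,
): LlamaCompletionPayload {
  return {
    // The fixed rules/roles prefix comes first so its KV entries are stable.
    prompt: systemPrompt + "\n" + turnText,
    n_predict: 512,
    cache_prompt: true,
    // Pin the system prompt in the KV cache: when the context window shifts,
    // these leading tokens are kept, so each turn only processes new dialogue.
    n_keep: systemPromptTokens,
  };
}
```

A plain OpenAI-compatible request has no place for `n_keep`, which is why the native integration can keep per-turn latency low while the compatible endpoint re-evaluates the large prefix.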

Pure Frontend (Serverless & Private)

This application is a Pure Frontend (SPA) built with Angular.

  • No Backend Server: There is no middleman server. Your API calls go directly from your browser to the LLM providers.
  • Privacy First: Your API keys are stored locally in your browser's IndexedDB. They are never uploaded to any server.
  • No Added Latency: With no intermediate server hop, each request goes straight to the provider, keeping the interface responsive.
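The direct browser-to-provider flow can be sketched as follows. This is a hypothetical example, not the app's actual code: the endpoint path follows the OpenAI-compatible convention, and the model name and function name are placeholders.

```typescript
// Build a chat request that goes straight from the browser to the provider.
// The API key is supplied by the caller (in the app it would come from the
// locally stored profile, e.g. IndexedDB) and is sent only to the provider.
function buildChatRequest(apiKey: string, baseUrl: string, userText: string): Request {
  return new Request(`${baseUrl}/v1/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // placeholder; whichever model the profile selects
      messages: [{ role: "user", content: userText }],
    }),
  });
}
```

Because the request object is constructed and dispatched entirely in the browser, no intermediate server ever sees the key or the conversation.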
