A workshop presentation exploring how game mechanics can reveal AI limitations through playful interactions. Designed for graduate students and instructors interested in critical approaches to AI pedagogy.
- Workshop Date: Thursday, October 16
- Duration: 90 minutes
- Facilitator: Zach Muhlbauer (Teaching and Learning Center | Interactive Technology & Pedagogy Lab)

This Reveal.js presentation demonstrates how game-based interactions can serve as analytical scaffolds for understanding and critiquing large language model capabilities and limitations. Through hands-on activities, participants design games that expose AI behaviors including hallucination, reasoning failures, and context limitations.
- Critical Play: Using Mary Flanagan's framework of critical game design to examine AI systems
- Playful Interactions: Six types of interactions that reveal AI limitations (reflecting, jesting, imitating, challenging, tricking, contriving)
- Game Mechanics as Scaffolds: How game rules make AI failures immediately visible and pedagogically useful
- Introductions (10 min) - Welcome and participant introductions
- Game Mechanics Demo (10 min) - Jeopardy! board emulator showing AI confabulation
- Critical Play Theory - Mary Flanagan's iterative design model
- Playful Interactions - Taxonomy of AI interaction types
- Quick Demo (10 min) - Word association game with constrained prompts
- Critical Design Activity (15 min) - Participants design their own AI-revealing games
- Shareback and Playtest (15 min) - Present and test game designs
- Jeopardy LM Demo - Interactive Jeopardy emulator for testing LLM knowledge boundaries
- Open WebUI (CUNY) - Platform for game design and playtest demonstrations
- Reveal.js 4.5.0 - HTML presentation framework
- Custom Canvas visualizations (network graph, arcade game, chess board, word vectors, ripple causation)
- No build process required - runs directly in browser
- Responsive design (1400x900 with 4% margin)
- Vertical navigation for nested slides
- Timing indicators on activity slides
- Section-based styling for workshop phases
- Animated canvas visualizations that initialize on slide entry
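
The layout settings in this list map onto standard Reveal.js options. Below is a minimal configuration sketch, assuming the presentation passes them to `Reveal.initialize`. Only the 1400x900 size and the 4% margin come from this README; the remaining options (`hash`, `controls`, `navigationMode`) are illustrative assumptions, not necessarily the repository's actual settings.

```javascript
// Sketch of a Reveal.js configuration for the layout described above.
// width, height, and margin are taken from this README; the other
// values are assumptions chosen for illustration.
const revealConfig = {
  width: 1400,               // authored slide width in pixels
  height: 900,               // authored slide height in pixels
  margin: 0.04,              // 4% of the display left empty around content
  hash: true,                // assumption: keep the current slide in the URL
  controls: true,            // assumption: show on-screen navigation arrows
  navigationMode: 'default', // assumption: left/right for sections, up/down for nested slides
};

// Only initialize when the Reveal.js script has been loaded in a browser.
if (typeof Reveal !== 'undefined') {
  Reveal.initialize(revealConfig);
}
```

Vertical navigation itself is not a configuration flag: in Reveal.js it comes from nesting `<section>` elements inside a parent `<section>` in the HTML.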
1. Clone this repository:

       git clone https://github.com/zmuhls/critical-play.git
       cd critical-play

2. Open `index.html` directly in your browser - no build step required
3. Navigate with arrow keys:
   - Right/Left: Move between main sections
   - Down/Up: Explore nested slides within sections
- Section: Intro
- Visualization: Animated network graph
- Name and pronouns
- Field of study or work
- Favorite game
- Visualization: Retro arcade game
- Chess.com example: ChatGPT making illegal moves
- Visualization: Chess board with illegal move attempts
Three category types to test AI:
- Simple category - Accurate baseline (e.g., US Presidents)
- Obscure real category - Mixed results (e.g., Byzantine Empresses)
- Fictitious category - Confabulation trigger (e.g., Emu Wars of 1932)
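
One way to run this comparison by hand is to send the same clue request for each category type and compare the answers. The sketch below only builds the prompt text. The category examples are copied from the list above, while the prompt wording, the `buildCluePrompt` helper, and the dollar value are hypothetical and are not taken from the Jeopardy LM Demo itself.

```javascript
// Category examples copied from the list above; the expectation labels
// restate what each category type is meant to reveal.
const categoryTypes = [
  { type: 'simple', example: 'US Presidents', expectation: 'accurate baseline' },
  { type: 'obscure-real', example: 'Byzantine Empresses', expectation: 'mixed results' },
  { type: 'fictitious', example: 'Emu Wars of 1932', expectation: 'confabulation trigger' },
];

// Hypothetical helper: the wording is an assumption, not the demo's prompt.
function buildCluePrompt(category, value) {
  return (
    `You are hosting Jeopardy! Write a $${value} clue for the category ` +
    `"${category.example}" and then give the correct response ` +
    `in the form of a question.`
  );
}

// Build one prompt per category type so the responses can be compared.
const cluePrompts = categoryTypes.map((category) => ({
  type: category.type,
  expectation: category.expectation,
  prompt: buildCluePrompt(category, 200),
}));

for (const item of cluePrompts) {
  console.log(`[${item.type}] ${item.prompt}`);
}
```

Keeping the prompt identical across the three types means any difference in the answers can be attributed to the category rather than to the phrasing.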
- Mary Flanagan's iterative design model
- Set goal → Develop rules → Prototype → Playtest → Revise → Repeat
| Type | What it does | Try this |
|---|---|---|
| Reflecting | Prompting AI to self-represent | Ask about self-understanding |
| Jesting | Generating humor/nonsense | Request absurd combinations |
| Imitating | Persona mimicry | Ask it to role-play |
| Challenging | Testing until failure | Push logical limits |
| Tricking | Boundary bypassing | Try jailbreak techniques |
| Contriving | Impossible content | Request non-existent things |
Game Format Options:
- 20 Questions
- Exquisite Corpse
- Two Truths and a Lie
- Word Association
- Trivia/Quiz Games
- Riddles/Puzzles
- Chess/Game Annotation
- Role Play/Improv
- Storytelling Chains
- Debate/Argument
- Mad Libs
Target AI Limitations:
- Hallucination/confabulation
- Logic inconsistency/reasoning failures
- Context window limitations
- Bias/stereotypes/harmful associations
- Sycophancy (excessive agreement)
- Instruction following failures
- Calibration issues (false confidence)
- Knowledge cutoff/temporal awareness
- Lack of theory of mind
- Safety guardrail bypasses
Prompt Crafting:
- System prompt: Configure AI behavior and constraints
- Example: "You are playing 20 Questions. I'm thinking of a famous person. Ask me yes/no questions to guess who it is. Every question you ask must rhyme with your previous question. Do not provide explanations or commentary."
- User prompt: First message to start the game
- Example: "I've thought of someone. Go ahead and ask your first question!"
- Optional settings: Temperature (0.0-1.5), Max tokens (50-1000)
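
The pieces above can be assembled into a single request object. The following is a minimal sketch, assuming a chat service that accepts the widely used `messages` format with `system` and `user` roles. The `buildGameRequest` helper, the `example-model` name, and the default values are hypothetical, and the README does not specify how the workshop's Open WebUI instance is called. The clamping ranges mirror the optional settings listed above.

```javascript
// Keep a numeric setting inside the range suggested in the workshop.
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Hypothetical helper: combines a system prompt, a starter prompt, and
// optional settings into one chat request payload.
function buildGameRequest({
  systemPrompt,
  userPrompt,
  temperature = 0.7,       // assumption: default not given in the README
  maxTokens = 300,         // assumption: default not given in the README
  model = 'example-model', // placeholder, not a real model name
}) {
  return {
    model,
    messages: [
      { role: 'system', content: systemPrompt },
      { role: 'user', content: userPrompt },
    ],
    temperature: clamp(temperature, 0.0, 1.5),
    max_tokens: Math.round(clamp(maxTokens, 50, 1000)),
  };
}

// The example prompts are quoted from the list above.
const exampleRequest = buildGameRequest({
  systemPrompt:
    "You are playing 20 Questions. I'm thinking of a famous person. " +
    'Ask me yes/no questions to guess who it is. Every question you ask ' +
    'must rhyme with your previous question. Do not provide explanations or commentary.',
  userPrompt: "I've thought of someone. Go ahead and ask your first question!",
  temperature: 1.0,
  maxTokens: 100,
});

console.log(JSON.stringify(exampleRequest, null, 2));
```

A low token limit suits this game because each turn should be a single question; longer outputs usually mean the model has ignored the "no commentary" constraint, which is itself a useful observation for the playtest.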
Present game designs including:
- System prompt
- Starter prompt(s)
- Optional settings
- Expected outcomes
Each visualization is self-contained and initializes when its slide becomes visible:
- `networkCanvas` - Animated network graph (title slide)
- `arcadeCanvas` - Retro arcade game (introductions)
- `chessCanvas` - Chess board with illegal moves (game mechanics slide)
- `vectorCanvas` - Word association semantic relationships (demo slide)
- `dominoCanvas` - Ripple causation chain reaction (expected outcomes slide)
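
Initialization on slide entry can be wired up through Reveal.js events. The sketch below is one possible approach, not the repository's actual code: it listens for the `ready` and `slidechanged` events, looks for canvases inside the current slide, and starts each one. The canvas ids come from the list above; `createCanvasManager`, `placeholderInit`, and the start-once guard are assumptions made for illustration.

```javascript
// Canvas ids from the list above.
const canvasIds = [
  'networkCanvas',
  'arcadeCanvas',
  'chessCanvas',
  'vectorCanvas',
  'dominoCanvas',
];

// Returns a handler that starts each known canvas the first time its
// slide becomes visible. The start-once guard is an assumption; a real
// visualization might restart on every visit instead.
function createCanvasManager(initializers) {
  const started = new Set();
  return function onSlideVisible(slide) {
    const startedNow = [];
    if (!slide) return startedNow;
    for (const canvas of slide.querySelectorAll('canvas')) {
      const init = initializers[canvas.id];
      if (init && !started.has(canvas.id)) {
        init(canvas);
        started.add(canvas.id);
        startedNow.push(canvas.id);
      }
    }
    return startedNow;
  };
}

// Hypothetical stand-in for the real drawing code: clears the canvas.
function placeholderInit(canvas) {
  const ctx = typeof canvas.getContext === 'function' ? canvas.getContext('2d') : null;
  if (ctx) ctx.clearRect(0, 0, canvas.width, canvas.height);
}

const initializers = Object.fromEntries(
  canvasIds.map((id) => [id, placeholderInit])
);

// Hook into Reveal.js only when it is loaded in a browser.
if (typeof Reveal !== 'undefined') {
  const onSlideVisible = createCanvasManager(initializers);
  Reveal.on('ready', (event) => onSlideVisible(event.currentSlide));
  Reveal.on('slidechanged', (event) => onSlideVisible(event.currentSlide));
}
```

Passing the slide element into the handler, instead of querying the whole document, keeps each visualization tied to its own slide and makes the logic easy to exercise with a stand-in object outside the browser.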
- Flanagan, M. (2009). Critical Play: Radical Game Design. MIT Press.
- Petridis, S., Bazhydai, M., Kinzler, K. D., & Ahl, R. E. (2023). Interrogating AI: Characterizing Emergent Playful Interactions with ChatGPT. CHI EA '23: Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems.
- Palisade Research (2025). Playing chess against a stronger opponent can trigger frontier AI agents to cheat. TIME Magazine.
- Nightly-Knight (Chess.com). Playing chess against ChatGPT | It is a cheater! Blog post.
- Acher, M. (2024). Debunking the Chessboard: Confronting GPTs Against Chess Engines. Research blog.
- r/ChatGPT community discussions on playful AI interactions
This project is licensed under the MIT License - see the LICENSE file for details.
This workshop is part of ongoing research into critical AI pedagogy. Feedback and contributions are welcome via issues or pull requests.
For questions or workshop facilitation inquiries, contact via the Teaching and Learning Center or Interactive Technology & Pedagogy Lab.