Use different models for the player and the grader so the grader does not favor its own output (self-bias).
Settings
Use LLM player
Auto-evaluate
Actions
Stats
Turns: 0
Avg ms: --
Tokens: 0
Export
More
Select a story and click Run to begin.
Evaluation
Run a test with "Auto-evaluate" enabled to see grader results here.
Scores Leaderboard
Test Cases
Select or create a test case to edit it.
Advanced Options
Agent Behavior
Stop on game end
When enabled, the agent stops as soon as the story reaches an end condition
and shows a victory screen. When disabled, the agent keeps playing after game-over signals.
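The effect of this option can be illustrated with a minimal sketch of an agent loop. This is not the tool's actual implementation: the names `run_agent`, `StepResult`, `make_toy_story`, and the `stop_on_game_end` parameter are all hypothetical, and the victory screen is left out. The sketch only shows how a single flag might decide whether the loop exits on an end signal or keeps going until the turn limit.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    """Hypothetical result of one story turn."""
    text: str    # story output for this turn
    ended: bool  # True when the story signals an end condition


def run_agent(
    step: Callable[[str], StepResult],
    choose_action: Callable[[int], str],
    max_turns: int = 10,
    stop_on_game_end: bool = True,
) -> List[str]:
    """Play up to max_turns turns and return the story output per turn.

    If stop_on_game_end is True, the loop exits on the first end signal.
    If it is False, end signals are ignored and play continues until
    max_turns is reached.
    """
    transcript: List[str] = []
    for turn in range(max_turns):
        result = step(choose_action(turn))
        transcript.append(result.text)
        if result.ended and stop_on_game_end:
            break
    return transcript


def make_toy_story(end_turn: int) -> Callable[[str], StepResult]:
    """Build a toy story (assumption for illustration only) that reports
    an end condition from turn number end_turn onward."""
    state = {"turn": 0}

    def step(action: str) -> StepResult:
        state["turn"] += 1
        return StepResult(
            text=f"turn {state['turn']}: {action}",
            ended=state["turn"] >= end_turn,
        )

    return step


if __name__ == "__main__":
    always_look = lambda turn: "look"
    stopped = run_agent(make_toy_story(3), always_look, max_turns=5,
                        stop_on_game_end=True)
    continued = run_agent(make_toy_story(3), always_look, max_turns=5,
                          stop_on_game_end=False)
    print(len(stopped), len(continued))
```

With the flag on, the toy run ends on the turn that first reports the end condition. With it off, the run uses all of its allotted turns, which may be useful when a story emits game-over signals that are recoverable.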