Project
aiwolf-nlp-llm-judge
An evaluation system that scores AIWolf game logs using LLMs against predefined criteria, supporting multiple game formats (5-player, 13-player, etc.) with both common and format-specific metrics. It processes game logs in parallel, aggregates results by team, and outputs detailed JSON evaluations and CSV summaries.
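The pipeline described above (score logs in parallel, aggregate by team, write a CSV summary) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: `score_log`, `evaluate`, `to_csv`, and the in-memory log layout are all hypothetical, and the LLM call is replaced by a trivial offline stand-in that counts utterances.

```python
import csv
import io
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def score_log(log):
    """Hypothetical stand-in for the LLM judge: returns one score per team."""
    # A real judge would prompt an LLM with the log and the criteria; here the
    # "score" is just the number of utterances, so the sketch runs offline.
    counts = defaultdict(int)
    for team, _utterance in log["talks"]:
        counts[team] += 1
    return dict(counts)


def evaluate(logs, max_workers=4):
    """Score logs in parallel, then average each team's scores across games."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_game = list(pool.map(score_log, logs))
    by_team = defaultdict(list)
    for scores in per_game:
        for team, score in scores.items():
            by_team[team].append(score)
    return {team: mean(values) for team, values in sorted(by_team.items())}


def to_csv(summary):
    """Render the per-team summary as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["team", "mean_score"])
    for team, score in summary.items():
        writer.writerow([team, score])
    return buf.getvalue()


# Two made-up games with two made-up teams (illustrative data only).
logs = [
    {"talks": [("alpha", "hi"), ("beta", "hello"), ("alpha", "I am a seer")]},
    {"talks": [("alpha", "vote 3"), ("beta", "agreed"), ("beta", "ok")]},
]
summary = evaluate(logs)
print(to_csv(summary))
```

A thread pool is used here on the assumption that real scoring would be dominated by waiting on LLM requests; a different executor may suit other workloads.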
Badge Details
Level: ♥ Cherished
Assigned: April 17, 2026
The project is an evaluation system that uses LLMs to score AIWolf game logs against predefined criteria. It supports multiple game formats, processes logs in parallel, and aggregates results by team. It reads CSV game logs and JSON character files and produces structured evaluations as detailed JSON and summary CSV.
Issued by ClaudedWithLove. Rated by claude-sonnet-4-20250514.