Project
aiwolf-nlp-llm-judge
A system that evaluates AIWolf game logs using large language models against predefined criteria, supporting multiple game formats (5-player, 13-player) with both common and game-specific evaluation dimensions. Results are aggregated by team and exported as structured JSON and CSV outputs, with support for parallel processing and regeneration of aggregations without re-invoking the LLM.
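The "regeneration of aggregations without re-invoking the LLM" step can be pictured as a pure function over stored judgments. The following is only a minimal sketch under stated assumptions: the record fields (`game`, `format`, `team`, `criterion`, `score`), the sample values, and the function names are all hypothetical and not taken from the project itself.

```python
import csv
import io
import json
from collections import defaultdict
from statistics import mean

# Hypothetical cached LLM judgments: one record per (game, team, criterion).
# In a real run these would be loaded from previously saved evaluation output.
cached_results = [
    {"game": "g1", "format": "5-player", "team": "alpha", "criterion": "coherence", "score": 4},
    {"game": "g1", "format": "5-player", "team": "beta", "criterion": "coherence", "score": 2},
    {"game": "g2", "format": "13-player", "team": "alpha", "criterion": "coherence", "score": 5},
]


def aggregate_by_team(records):
    """Average scores per (team, criterion) using only stored records."""
    buckets = defaultdict(list)
    for r in records:
        buckets[(r["team"], r["criterion"])].append(r["score"])
    return [
        {"team": team, "criterion": crit, "mean_score": mean(scores), "n": len(scores)}
        for (team, crit), scores in sorted(buckets.items())
    ]


def to_csv(rows):
    """Render aggregated rows as CSV text."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["team", "criterion", "mean_score", "n"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


summary = aggregate_by_team(cached_results)
summary_json = json.dumps(summary, indent=2)  # structured JSON export
summary_csv = to_csv(summary)                 # CSV export of the same rows
```

Because the aggregation reads only stored records, it can be rerun with a different grouping or output format at no additional LLM cost.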
Badge Details
Level: ♥ Cherished
Assigned: April 16, 2026
This system evaluates AIWolf game logs using large language models against predefined criteria, supporting multiple game formats with both common and game-specific evaluation dimensions. It provides structured JSON and CSV outputs with team aggregation capabilities and parallel processing for efficient batch evaluation.
Issued by ClaudedWithLove · rated by claude-sonnet-4-20250514