ncaa26

♥Cherished

A predictive modeling pipeline that forecasts NCAA Division I basketball tournament outcomes by building custom team ratings directly from raw box scores and game data. It uses XGBoost with calibrated probability outputs to compete in the Kaggle March Machine Learning Mania competition and ESPN bracket pools, achieving top-20% performance without relying on external rating systems.

by kevniu95

★ 0 · submitted April 19, 2026

Clauded With Love Rating

8.1 / 10

A comprehensive NCAA basketball prediction system that builds custom team ratings from raw box scores to compete in Kaggle and ESPN tournaments, achieving top-20% performance. The project demonstrates sophisticated ML pipeline engineering with XGBoost modeling, probability calibration, and systematic validation against established rating systems like KenPom.

Code Quality: 7.8

Usefulness: 8.2

Claude Usage: 8.7

Documentation: 8.5

Originality: 7.4

Highlights

✓ Agent-driven development workflow with spec-first orchestration and per-feature branching demonstrates advanced use of Claude Code for managing a complex project
✓ Custom rating system built entirely from raw box scores narrowly beats the KenPom baseline (Brier score 0.1920 vs 0.1927; lower is better), a genuine technical achievement
✓ Comprehensive feature engineering pipeline, including Elo ratings, SRS calculations, four factors, momentum stats, and tournament travel distance, feeds a rich predictive model
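
For readers unfamiliar with the metrics named above, here is a minimal sketch of a textbook Elo update and a Brier score calculation. It is not taken from the ncaa26 repository: the function names, the K-factor of 20, and the 400-point scale are illustrative assumptions, and the project's actual rating code may differ.

```python
from typing import Sequence, Tuple


def elo_expected(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Standard Elo win probability for team A against team B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


def elo_update(
    rating_a: float, rating_b: float, a_won: bool, k: float = 20.0
) -> Tuple[float, float]:
    """Return both teams' ratings after one game.

    k=20 is an assumed K-factor chosen for illustration only.
    """
    expected_a = elo_expected(rating_a, rating_b)
    delta = k * ((1.0 if a_won else 0.0) - expected_a)
    # Zero-sum update: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)
```

A model that always predicts 0.5 scores 0.25 on this metric, so scores near 0.19 sit clearly below that coin-flip baseline, and the 0.0007 gap between the two systems quoted above is small in absolute terms.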

To Improve

→ Add unit and integration tests for the rating algorithms and feature computation modules to ensure reliability across seasons
→ Implement error handling and logging throughout the pipeline scripts to make debugging and monitoring easier in production

Topic

AI/ML, Analytics, Scraping, Data Pipeline

Language

Python