BudgetBench | Aayush Kumar

🎯 The Problem

Models expose reasoning-budget knobs, but it is unclear which tasks benefit from a larger budget or how to allocate that budget optimally.

💡 The Solution

BudgetBench combines a standardized evaluation protocol, an Adaptive Budget Controller that escalates the reasoning budget only when needed, and a reproducible pipeline built on open weights and datasets.
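To make "escalates only when needed" concrete, here is a minimal sketch of one possible escalation loop. It is an illustration under stated assumptions, not the project's actual controller: the `solve` callable, the budget ladder, and the confidence threshold are all hypothetical names and values.

```python
from typing import Callable, Sequence, Tuple


def answer_with_escalation(
    solve: Callable[[int], Tuple[str, float]],
    budgets: Sequence[int] = (256, 1024, 4096),  # hypothetical budget ladder, in tokens
    min_confidence: float = 0.8,                 # hypothetical stopping threshold
) -> Tuple[str, int, int]:
    """Try budgets from smallest to largest and stop at the first confident answer.

    `solve(budget)` is an assumed interface: it runs the model with the given
    reasoning budget and returns (answer, confidence in [0, 1]).
    Returns (answer, final_budget, total_budget_spent).
    """
    if not budgets:
        raise ValueError("budgets must be non-empty")

    spent = 0
    answer = ""
    budget = budgets[0]
    for budget in budgets:
        answer, confidence = solve(budget)
        spent += budget  # every attempt is paid for, including failed ones
        if confidence >= min_confidence:
            break  # confident enough: do not escalate further
    return answer, budget, spent
```

Note that the total spent includes the cheaper attempts that were abandoned, so escalation only pays off when a meaningful share of inputs stop at a low budget.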

✨ Key Highlights

Accuracy–cost Pareto curves
Adaptive budget escalation
$0 reproducible pipeline
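As a rough illustration of what an accuracy–cost Pareto curve contains, the sketch below filters a set of (cost, accuracy) measurements down to the non-dominated points. The function name and the data layout are assumptions made for this example, not part of BudgetBench itself.

```python
from typing import List, Tuple


def pareto_frontier(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Return the (cost, accuracy) points that no other point dominates.

    A point is dominated if another point costs no more and is at least as
    accurate, with at least one of the two strictly better.
    """
    frontier: List[Tuple[float, float]] = []
    best_accuracy = float("-inf")
    # Sort by cost ascending; at equal cost, visit the most accurate point first.
    for cost, accuracy in sorted(points, key=lambda p: (p[0], -p[1])):
        if accuracy > best_accuracy:
            frontier.append((cost, accuracy))
            best_accuracy = accuracy
    return frontier
```

Each surviving point is a budget setting for which no cheaper or equally cheap setting is at least as accurate; plotting these points in cost order gives the curve.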

Other Projects

HyperSentry

An agentic security copilot that uses policy checks and retrieval to generate remediation PRs and accelerate incident analysis.

SchemaPulse

OpenAI-compatible inference layer for schema-valid structured outputs with streaming tool-call parsing.

EdgeJury

Multi-LLM council with cross-review for truthful QA on serverless edge inference.