Li's Bench
A drill platform I built for myself while preparing for research-engineer interviews at frontier AI labs. Single-user. Opinionated. Daily-driver.
This site documents how I use it, surface by surface. It is not a manual for the public: the bench is invite-only alpha, and this is a record of the shape I've grown into, both for future-me and for the agents that work on this codebase with me.
What it looks like in a day
Morning   →  /                 drill picks for today (one reading, one problem, sometimes a writing prompt)
             ↓
             /problem/<slug>   submit, get a structured review (in Hamming's voice, or whoever)
             ↓
Mid-week  →  /focus            a glance at active initiatives: capstone, bench dev, content pipeline
             ↓
             /news             what landed overnight from Anthropic, Apollo, METR, Alignment Forum, the labs
             ↓
Saturday  →  /reflect          Hamming-mode prompts auto-generated against the week's actual data
             ↓
             /writings         draft something from what surfaced; sometimes ship to Substack
Every surface in that chain has a page in How I use it explaining what I do there, why I built it that shape, and what I'd change if I were starting again.
Why I built it (briefly)
I noticed I was thrashing — three browser tabs of practice problems, a journal in Notion, a separate doc of reading notes, half-remembered Substack threads about interpretability, a Linear board for the capstone. None of it connected. None of it remembered what I'd already worked on.
The bench is the consolidation. Practice, reading, reflection, and decision-tracking all read from and write to the same store. The scheduler knows what I attempted yesterday and proposes the next thing accordingly. Reflect prompts cite specific numbers from my week ("you skipped Plan on 4 of 6 attempts: is that signal or noise?"). The MCP server lets a Claude chat see what initiative I'm pushing on and write its decision back to /decisions, so the decision isn't lost in a chat scroll.
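To make the "same store" idea concrete, here is a minimal sketch of how one list of attempt records could feed both the scheduler and a reflect prompt. Everything in it is an assumption for illustration: the `Attempt` record, its field names, the stage labels, the problem slugs, and both functions are hypothetical and not the bench's actual code or schema.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from datetime import date


@dataclass
class Attempt:
    """One drill attempt (hypothetical shape, not the real data model)."""
    slug: str
    day: date
    stages_done: set[str] = field(default_factory=set)


def skipped_stage_prompt(attempts: list[Attempt], stage: str) -> str | None:
    """Return a reflect prompt citing how often `stage` was skipped, or None."""
    skipped = sum(1 for a in attempts if stage not in a.stages_done)
    if skipped == 0:
        return None  # nothing to reflect on (also covers an empty week)
    return (
        f"you skipped {stage} on {skipped} of {len(attempts)} attempts: "
        "is that signal or noise?"
    )


def propose_next(attempts: list[Attempt], catalog: list[str]) -> str | None:
    """Propose the least-recently attempted slug; never-attempted slugs win."""
    if not catalog:
        return None
    last_seen: dict[str, date] = {}
    for a in attempts:
        if a.slug not in last_seen or a.day > last_seen[a.slug]:
            last_seen[a.slug] = a.day
    # Slugs with no attempts sort as date.min, so they come first.
    return min(catalog, key=lambda s: last_seen.get(s, date.min))


# Made-up week of six attempts, four of them without a Plan stage.
week = [
    Attempt("kv-cache", date(2024, 1, 1), {"Plan", "Code"}),
    Attempt("attention-mask", date(2024, 1, 2), {"Code"}),
    Attempt("kv-cache", date(2024, 1, 3), {"Code"}),
    Attempt("beam-search", date(2024, 1, 4), {"Plan", "Code"}),
    Attempt("attention-mask", date(2024, 1, 5), {"Code"}),
    Attempt("beam-search", date(2024, 1, 6), {"Code", "Review"}),
]
catalog = ["kv-cache", "attention-mask", "beam-search", "tokenizer"]

print(skipped_stage_prompt(week, "Plan"))
print(propose_next(week, catalog))
```

The point of the sketch is only that both functions read the same records, so the prompt's numbers and the scheduler's pick cannot drift apart. A real scheduler would presumably weigh more than recency.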
Map of the rest
- :material-target: How I use it
  Feature-by-feature tour. Drilling problems, picking voices, weekly reflections, reading library, JDs, readiness scoring.
- :material-source-branch: Architecture
  Code tour, data model, services. For when I'm extending it (or when an agent needs context to extend it).
- :material-tools: Development
  Setup, common tasks, deploy, autonomous coding agents. Skim if you're new to the repo; reference if you're touching infra.
- :material-map: Roadmap & decisions
  What I've shipped, what's queued, what I rejected and why.
One read
If you read one thing from this site, make it Drilling problems. It's the surface I touch most often, and the shape everything else accommodates.