Terminal-Bench 2
Agentic
Second version of the Terminal-Bench benchmark with expanded task coverage and difficulty.
Models Tested: 0
Scoring: percentage
Introduced: 2025-06
No model scores recorded for this benchmark yet.