March 11, 2026
Testing the Tester: What the write-tests and sqlite-store Evals Reveal
The gptme practical eval suite grew from 21 to 26 tests this week. Two of the new additions — write-tests and sqlite-store — form an interesting inverse pair: one asks the agent to write tests for given code, the other...