Tips for getting coding agents to write good Python tests
- Simon Willison outlines effective strategies for improving AI-generated Python test quality.
- Developers should guide agents to use specific libraries like pytest-httpx and refactor using fixtures.
- Pattern imitation from existing high-quality repositories significantly enhances autonomous agent output.
Simon Willison, a prominent technologist and co-creator of Django, recently shared practical advice on getting better Python tests out of coding agents. He notes that the abundance of high-quality Python code in training data already gives these models a strong foundation, particularly with popular frameworks like pytest. Directing agents toward specific third-party tools, such as pytest-httpx for mocking HTTP requests made with httpx, helps developers move beyond generic boilerplate toward production-grade test suites.

Willison also observes that agents tend to produce repetitive setup code. Rather than accepting this technical debt, he recommends explicitly prompting the agent to refactor using fixtures (reusable functions that set up the environment a test needs) and parametrization, which runs the same test against multiple inputs. This iterative refinement keeps the resulting test suite maintainable.

Perhaps the most effective strategy is to give the agent a "gold standard" to follow. By pointing an agent at an existing repository and instructing it to imitate that repository's testing patterns, developers avoid elaborate prompt engineering. The approach relies on the model's ability to recognize and replicate existing styles, which aligns the agent's output with the developer's personal or organizational coding standards without lengthy instructions.
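
To make the fixture and parametrization refactor concrete, here is a minimal sketch of what an agent might produce when asked to remove repeated setup code. It is an illustration only, not code from Willison's post: `slugify`, `FakeStore`, and the test names are hypothetical and exist purely to give the tests something to exercise.

```python
import pytest


def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase a title and hyphenate it."""
    return "-".join(title.lower().split())


class FakeStore:
    """Hypothetical in-memory stand-in for a database or external service."""

    def __init__(self):
        self.items = {}

    def add(self, title: str) -> str:
        slug = slugify(title)
        self.items[slug] = title
        return slug


@pytest.fixture
def store():
    # Shared setup is written once here instead of being repeated per test.
    s = FakeStore()
    s.add("Existing Post")
    return s


@pytest.mark.parametrize(
    "title, expected",
    [
        ("Hello World", "hello-world"),
        ("  Extra   spaces  ", "extra-spaces"),
        ("already-slugged", "already-slugged"),
    ],
)
def test_add_returns_slug(store, title, expected):
    # The same test body runs once for each (title, expected) pair.
    assert store.add(title) == expected
    assert store.items[expected] == title


def test_fixture_seeds_one_item(store):
    # Each test receives a fresh store built by the fixture.
    assert list(store.items) == ["existing-post"]
```

Saved as a file such as `test_store.py` (an assumed name), this runs with the plain `pytest` command. The fixture replaces copy-pasted setup, and the parametrized test replaces three near-identical test functions, which is the kind of cleanup the refactoring prompt is meant to trigger.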