Simon Willison Addresses AI Writing Accusations
- Simon Willison refutes accusations of using generative AI models to write his personal blog content.
- The perceived 'LLM smell' is actually a result of legacy Python code for em dash formatting.
- The incident demonstrates the risks of misidentifying human-authored content as synthetic based on stylistic patterns.
Simon Willison, a well-known technologist and co-creator of the Django web framework, has addressed recurring accusations that he uses large language models (LLMs) to write his blog content. The critique often centers on what is colloquially known as "LLM smell": a set of stylistic tics, formatting choices, or structural patterns that readers frequently associate with AI-generated text. In Willison’s case, the primary "offense" is his frequent and precise use of em dashes, a punctuation mark that AI models often use to build complex, multi-clause sentences.
The reality, however, is far more mundane and predates the modern AI era. Willison revealed that his em dashes come from a Python code snippet he implemented in 2015. The script automatically scans his blog posts and replaces each standard hyphen surrounded by spaces with a proper em dash. The automation was part of a migration to GitHub nearly a decade ago, long before ChatGPT or similar models became household names. The episode highlights a significant challenge in the current digital landscape: the tendency to retroactively label long-standing human habits as AI-generated artifacts.
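The hyphen replacement described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Willison's actual code: the function name `hyphens_to_em_dashes`, the regular expression, and the choice to drop the surrounding spaces are all assumptions made for the example.

```python
import re

# The em dash is written as a Unicode escape (U+2014) so it is easy to spot.
EM_DASH = "\u2014"

# Hypothetical pattern: a single hyphen with one space on each side,
# sitting between two non-whitespace characters. The lookarounds keep
# indented list markers such as "  - item" from being rewritten.
SPACED_HYPHEN = re.compile(r"(?<=\S) - (?=\S)")


def hyphens_to_em_dashes(text: str) -> str:
    """Replace each space-surrounded hyphen with an em dash.

    Assumption: the surrounding spaces are removed along with the hyphen.
    Hyphens inside words (for example "well-known") are left untouched.
    """
    return SPACED_HYPHEN.sub(EM_DASH, text)
```

For example, `hyphens_to_em_dashes("fast - but careful")` returns `"fast\u2014but careful"`, while `"well-known"` passes through unchanged. A rule this small, applied automatically to every post, is enough to produce the consistent punctuation that readers later mistook for machine output.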
This incident serves as a critical case study in the difficulty of AI detection based purely on stylistic "vibes." As users become more attuned to AI patterns, they risk misidentifying genuine human creativity and personal automation as synthetic output. Willison’s situation suggests that while AI models are trained on human data, humans also employ tools and scripts that mimic the structured consistency of machines. Ultimately, the presence of a specific punctuation mark or formatting style is a poor proxy for determining the origin of a piece of writing.