Can AI Agents Legally Rewrite Open Source Code?
- Maintainer uses Claude Code to rewrite an LGPL-licensed library into an MIT-licensed version.
- Original creator disputes the rewrite, claiming prior exposure to the code invalidates its clean-room status.
- Plagiarism detection tools show only 1.29% similarity between the AI-generated code and the original logic.
The traditional "clean room" implementation—a method used to clone software without infringing on copyrights—is facing a massive disruption from AI coding agents. In a high-profile case involving the Python library chardet, maintainer Dan Blanchard used Claude Code to perform a ground-up rewrite, successfully shifting the project from a restrictive LGPL license to a more permissive MIT license. This move has sparked a heated debate with the original creator, Mark Pilgrim, who argues that a true clean-room process requires total separation between those who know the original code and those writing the new version.
Blanchard's approach used AI to generate fresh logic from design documents rather than copying existing source files. While the maintainer admits to a decade of exposure to the original code, he points to automated plagiarism checks showing a mere 1.29% similarity between the old and new versions. This raises a profound legal question: does the lack of structural similarity prove independence, or do the human's prior knowledge and the AI's training data fundamentally taint the output?
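The article does not say which plagiarism tool was used or how its similarity score is computed. As a rough illustration of what such an automated check measures, here is a minimal sketch using Python's standard-library `difflib` to compare two normalized source texts; the function name and the sample snippets are hypothetical, not taken from the chardet case.

```python
# Hypothetical sketch of a code-similarity check, NOT the tool used in the
# chardet dispute. Uses only the Python standard library.
import difflib


def similarity_ratio(original: str, rewrite: str) -> float:
    """Return a similarity score in [0, 1] between two source texts."""
    # Strip blank lines and leading/trailing whitespace so that pure
    # formatting differences do not inflate the score.
    a = "\n".join(ln.strip() for ln in original.splitlines() if ln.strip())
    b = "\n".join(ln.strip() for ln in rewrite.splitlines() if ln.strip())
    return difflib.SequenceMatcher(None, a, b).ratio()


# Illustrative inputs: same behavior, independently written.
old_code = "def detect(data):\n    return guess_encoding(data)\n"
new_code = "def sniff(buffer):\n    result = analyze(buffer)\n    return result\n"

print(f"similarity: {similarity_ratio(old_code, new_code):.2%}")
```

Real plagiarism detectors typically work on token streams or abstract syntax trees rather than raw text, precisely so that renaming variables cannot hide copying; a character-level diff like this is only the simplest version of the idea.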
The controversy highlights a looming challenge for intellectual property in the age of generative AI. If agents can rapidly recreate complex software libraries with enough variation to bypass traditional copyright markers, the very foundation of open-source licensing could be at risk. This case serves as a microcosm for future corporate litigation where proprietary code might be laundered through AI agents to strip away licensing obligations or trade secret protections.