Suchir Balaji spent nearly four years as an artificial intelligence researcher at OpenAI. Among other projects, he helped gather and organize the enormous amounts of internet data the company used to build its online chatbot, ChatGPT.

At the time, he did not carefully consider whether the company had a legal right to build its products in this way. He assumed the San Francisco startup was free to use any internet data, whether it was copyrighted or not.

But after the release of ChatGPT in late 2022, he thought harder about what the company was doing. He concluded that OpenAI’s use of copyrighted data violated the law and that technologies like ChatGPT were damaging the internet.