Microsoft, OpenAI sued by New York Times over copyright infringement

The New York Times' lawsuit against OpenAI and Microsoft comes amid several copyright infringement lawsuits against makers of AI tools

The New York Times on Wednesday filed a lawsuit in federal court against OpenAI and Microsoft, alleging that the companies used the Times’ content to train artificial intelligence (AI) models without permission, infringing the outlet’s copyrights in the process.

The lawsuit, filed in the Southern District of New York, claims that OpenAI, the maker of generative AI chatbot ChatGPT, and its financial backer Microsoft infringed the Times’ copyrights by building training datasets containing millions of copies of its copyrighted content. The outlet also claims that its copyrights were violated by the output of generative AI tools like ChatGPT.

"Defendants’ generative artificial intelligence ("GenAI") rely on large-language models ("LLMs") that were built by copying and using millions of The Times’ copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides and more," the complaint states.


The New York Times is suing OpenAI and Microsoft over alleged copyright infringement. (Beata Zawrzel/NurPhoto via Getty Images)

"Through Microsoft’s Bing Chat (recently rebranded as "Copilot") and OpenAI’s ChatGPT, Defendants seek to free-ride on The Times’ massive investment in its journalism by using it to build substitutive products without permission or payment," the lawsuit added.


A spokesperson for OpenAI told FOX Business, "We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from AI technology and new revenue models. Our ongoing conversations with the New York Times have been productive and moving forward constructively, so we are surprised and disappointed with this development. We’re hopeful that we will find a mutually beneficial way to work together, as we are doing with many other publishers."

Microsoft did not immediately respond to a request for comment.

OpenAI, the creator of ChatGPT, has acknowledged using copyrighted content to train its AI models. (Reuters Photos)

The Times’ complaint is the latest of several copyright infringement lawsuits against companies building and marketing AI tools.

A group of authors including John Grisham and "Game of Thrones" writer George R.R. Martin filed a lawsuit against OpenAI, and noted in late November that they planned to add Microsoft to the suit as a defendant, given the tech giant’s ties to OpenAI.


Microsoft has invested in OpenAI, and the two companies enjoy a close relationship. (Jonathan Raa/NurPhoto via Getty Images)

The lawsuit involving Martin and Grisham was filed by the Authors Guild in the Southern District of New York, the same court where the Times filed its suit. A similar lawsuit was filed against OpenAI by nonfiction author Julian Sancton in that court.

Separately, comedian Sarah Silverman and another group of authors sued Meta, the parent company of Facebook and Instagram, over its use of their copyrighted content in training at least one of its AI models.

OpenAI has previously suggested that its use of copyrighted materials to train AI models is protected under what's known as the fair use doctrine. A report by the Congressional Research Service (CRS) updated in August noted that OpenAI acknowledged in a document filed with the U.S. Patent and Trademark Office that its programs are trained with the "use of large, publicly available datasets that include copyrighted works."


The company argued that those original works are used in a "highly transformative" way to train the systems on the patterns of human-generated content so that they can provide more useful AI-generated content, which is different from the content they were trained on.

The CRS noted that whether the copyrighted content can be used under the fair use doctrine depends on four legal factors: the purpose and character of the use, including whether it is for commercial or nonprofit educational purposes; the nature of the copyrighted work; the amount and substantiality of the copyrighted work used; and the effect of the use on the potential market for or value of the copyrighted work.