The New York Times says OpenAI deleted potential evidence for lawsuits


Lawsuits are never exactly a love fest, but the copyright battle pitting The New York Times against OpenAI and Microsoft is getting particularly contentious. This week, the Times claimed that OpenAI engineers had inadvertently deleted data that the paper’s team had spent more than 150 hours extracting as potential evidence.

OpenAI was able to recover much of the data, but the Times’ legal team says the original file names and folder structure are still missing. According to an affidavit filed in court Wednesday by Jennifer B. Meisel, an attorney for the paper, that means the information “cannot be used to determine where the plaintiffs’ copied news articles” may have been included in OpenAI’s AI models.

The Times filed a copyright lawsuit against OpenAI and Microsoft last year, claiming the companies illegally used its articles to train artificial intelligence tools like ChatGPT. The case is one of many ongoing legal battles between AI companies and publishers, including a similar suit filed by the Daily News, which is being handled by some of the same lawyers.

The Times’ case is currently in discovery, meaning both sides are turning over requested documents and information that could become evidence. As part of the lawsuit, the court required OpenAI to show the Times its training data, which is a big deal: OpenAI has never publicly disclosed exactly what information was used to build its AI models. To provide that access, OpenAI created what the court called a “sandbox” of two “virtual machines” that the Times’ lawyers could sift through. In her statement, Meisel said that OpenAI engineers “deleted” data organized by the Times team on one of those machines.

According to Meisel’s filing, OpenAI acknowledged that the information had been deleted and attempted to address the issue shortly after being alerted to it earlier this month. But when the newspaper’s lawyers looked at the “recovered” data, they found it too disorganized, forcing them to “recreate their work from scratch, using significant man-hours and computer processing time,” several other Times lawyers said in a letter sent to the judge the same day as Meisel’s declaration.

The lawyers noted that they had “no reason to believe” that the deletion was “intentional.” In emails submitted as evidence with Meisel’s letter, OpenAI counsel Tom Gorman referred to the deletion of the data as a “problem.”
