The New York Times legal problem deepens as OpenAI accidentally deletes important data

OpenAI accidentally deleted crucial data related to its copyright lawsuit with The New York Times during ongoing legal proceedings over copyright infringement claims. The incident involved data from dedicated virtual machines provided to the plaintiffs, which OpenAI acknowledged to the court in a recent filing. As a result, attorneys for the Times have stated they lost a week’s worth of work related to the case.

OpenAI faces data loss setback in lawsuit with The New York Times

According to a letter from the Times’ legal team, this data loss involved “an entire week’s worth of its experts’ and lawyers’ work” and was “irretrievably lost.” The plaintiffs were investigating claims that OpenAI’s models had been trained on unauthorized content. As part of this process, they accumulated data over 150 hours of intensive research on OpenAI’s training datasets, specifically looking for instances of copyright infringement. A report from TechCrunch indicated that the deletion occurred on November 14, when “programs and search result data stored on one of the dedicated virtual machines was erased by OpenAI engineers.”

The core of the lawsuit asserts that OpenAI, along with Microsoft—its partner using OpenAI’s technology for its Bing AI chatbot—has infringed The New York Times’ copyright by utilizing paywalled content without authorization. The Times claims OpenAI’s models produced “near-verbatim” replicas of its articles, forming its argument for damages. OpenAI has consistently refuted these allegations, claiming that its training was based on publicly available data, qualifying as fair use under copyright laws.

A spokesperson for OpenAI commented that the incident was a “glitch.” At the same time, they successfully recovered most of the deleted data, and critical elements, including “the folder structure and file names,” remain lost and consequently unusable. As a result, the Times’ attorneys now face the challenge of restarting their evidence collection from the ground up. Despite the circumstances, they reported having “no reason to believe [the erasure] was intentional,” stressing that OpenAI is best positioned to search its datasets. Yet, they also noted the company’s reluctance to disclose details about its training data.

The NewYork Times legal problem deepens as OpenAI accidentally deletes important data — **The New York Times has reportedly invested over $1 million in legal fees to pursue its case against OpenAI** (Screenshot: New York Times)

Further complicating matters, similar copyright claims have emerged against OpenAI. A recent lawsuit against the company by Raw Story and AlterNet was dismissed because the plaintiffs could not provide sufficient evidence of harm related to their allegations. In contrast, The New York Times has reportedly invested over $1 million in legal fees to pursue its case against OpenAI. This financial commitment illustrates smaller publishers’ distinct challenge when vying against substantial-tech companies.

OpenAI, on the other hand, has recently entered licensing agreements with several major media companies, allowing the use of their content to train its AI models, thereby providing compensation and credit. Reports indicate OpenAI is paying publishing giant Dotdash Meredith at least $16 million annually for licensing rights, reflecting its strategy of seeking formal partnerships rather than ongoing litigation.

Image credit: Furkan Demirkaya/Ideogram

The New York Times legal problem deepens as OpenAI accidentally deletes important data

OpenAI deleted vital data in its copyright case with The New York Times, complicating claims of using paywalled content to train its AI models.

Bünyamin Furkan Demirkaya

Related Posts

Microsoft’s AI chip shopping spree leaves rivals in the dust

AI will transform filmmaking but human actors can still play their part

OpenAI launches real-time video features for ChatGPT

Google’s Gemini 2.0 is here: Multimodal and mighty

LATEST

Walmart’s new gaming site aims to fill the game informer void

Apple goes Pixel style? iPhone 17 Pro Max features bold camera redesign

Microsoft’s AI chip shopping spree leaves rivals in the dust

Samsung Galaxy S25 leaks are more of the same

Pokémon Go holiday event adds Festive Dedenne and big rewards

Windows 11 adds a game-changing webcam feature

Nvidia stock falls hard: Are thermal chip issues just the start?

Blackmagic’s $30K camera promises Vision Pro like you’ve never seen

AI will transform filmmaking but human actors can still play their part

Nintendo Switch 2 leaks reveal design and performance upgrades

© 2021 TechBriefly is a Linkmedya brand.

The New York Times legal problem deepens as OpenAI accidentally deletes important data

OpenAI deleted vital data in its copyright case with The New York Times, complicating claims of using paywalled content to train its AI models.

OpenAI faces data loss setback in lawsuit with The New York Times

Related Posts

LATEST

© 2021 TechBriefly is a Linkmedya brand.

Follow Us