Landmark Study Exposes Significant Data Extraction from AI Models like ChatGPT

Tanmay Deshpande
4 min readNov 29, 2023
Source — https://arxiv.org/pdf/2311.17035.pdf

In a groundbreaking study, researchers have demonstrated the ability to extract significant amounts of training data from various AI language models, including the widely-used ChatGPT. This revelation raises serious concerns about the privacy and security implications of large language models (LLMs) in AI.

Extractable Memorization: A New Frontier

The study, conducted by a team of experts, delves into “extractable memorization” — a phenomenon where adversaries can efficiently recover training data from a model without prior knowledge of the dataset​​. This approach differs from “discoverable memorization,” which refers to memorizing data that can be recovered only by prompting the model with specific training data.

Shocking Revelation in ChatGPT

A startling discovery was the vulnerability of ChatGPT (specifically, the gpt-3.5-turbo model) to a new form of attack termed “divergence attack.” The attack forces the model to deviate from its standard chatbot-style generations, causing it to emit training data at a rate 150 times higher than normal​​.

Ethical Considerations and Responsible Disclosure

--

--