LLMs Can Unmask Pseudonymous Users at Scale with Up to 90% Precision
Researchers have demonstrated that large language models can deanonymize pseudonymous users across multiple social media platforms with alarming accuracy: up to 90% precision and 68% recall. The findings, published in a peer-reviewed paper, pose a fundamental threat to online privacy as we know it.
The Research
The study tested LLMs' ability to match posts across platforms (e.g., linking a Hacker News account to a LinkedIn profile) by analyzing writing style, topic preferences, and behavioral patterns:
- Precision: Up to 90% of deanonymization guesses were correct
- Recall: 68% of pseudonymous users were successfully identified
- Scale: The approach works at population scale, not just against targeted individuals
- Cost: The process is fast and cheap compared to traditional investigation methods
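To make the headline numbers concrete: precision is the fraction of the system's linkage guesses that turn out to be correct, while recall is the fraction of all pseudonymous users it manages to identify. A minimal sketch, using illustrative counts chosen to roughly reproduce the reported figures (these are assumed numbers, not the paper's actual confusion matrix):

```python
def precision(true_links, false_links):
    """Fraction of attempted deanonymization guesses that were correct."""
    return true_links / (true_links + false_links)

def recall(true_links, missed_users):
    """Fraction of all pseudonymous users successfully identified."""
    return true_links / (true_links + missed_users)

# Illustrative counts (assumed): out of 100 pseudonymous users, the
# system links 68 correctly, mislinks 8, and misses 32 entirely.
print(round(precision(68, 8), 2))  # → 0.89
print(round(recall(68, 32), 2))    # → 0.68
```

A system can trade recall for precision by abstaining on low-confidence matches, which is why the two figures differ.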
Why This Changes Everything
Online pseudonymity has long rested on an implicit threat model: people assumed that while total anonymity is impossible, pseudonymity provided adequate protection, because targeted deanonymization required extensive manual effort. LLMs invalidate this assumption:
- Doxxing at scale: Mass identification of anonymous commenters, reviewers, and whistleblowers
- Stalking: Linking public but pseudonymous accounts to real identities
- Marketing surveillance: Building detailed consumer profiles from scattered anonymous data
- Authoritarian risk: Governments could use this to identify dissidents and activists
Technical Approach
The researchers' framework works by:
- Cross-platform data collection: Gathering posts from multiple platforms
- Writing style analysis: LLMs analyze syntax, vocabulary, sentence structure, and topic patterns
- Behavioral fingerprinting: Posting times, interaction patterns, topic preferences
- Statistical correlation: Matching behavioral and stylistic features across accounts
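The paper's exact pipeline is not reproduced here, but the core idea of stylistic correlation can be sketched with a classic stylometric baseline: character n-gram profiles compared by cosine similarity. Everything below, including the threshold, is an illustrative assumption; a real LLM-based attack would use learned embeddings rather than raw n-grams:

```python
from collections import Counter
import math

def style_vector(text, n=3):
    """Character n-gram frequency profile, a simple stylometric feature."""
    text = text.lower()
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse feature dicts."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_match(anon_posts, candidate_profiles, threshold=0.5):
    """Link an anonymous account to its most stylistically similar candidate.

    Returns (candidate_id, score), or (None, score) below the threshold,
    so the system can abstain rather than guess.
    """
    anon_vec = style_vector(" ".join(anon_posts))
    scored = [(cid, cosine(anon_vec, style_vector(" ".join(posts))))
              for cid, posts in candidate_profiles.items()]
    cid, score = max(scored, key=lambda t: t[1])
    return (cid, score) if score >= threshold else (None, score)
```

Swapping the n-gram profiles for LLM-derived embeddings of style and topic is what lifts this baseline from a research curiosity to the population-scale attack the paper describes.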
Implications for Platforms
Social media platforms and forums face difficult choices:
- Stronger anonymity tools: More aggressive identity separation between accounts
- LLM-resistant design: Adding noise or style randomization to posts
- Policy responses: Updating terms of service regarding cross-platform deanonymization
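The "noise or style randomization" idea can be illustrated with a toy rewriter that randomly flattens a few stylistic tells, such as contraction use and punctuation habits. The substitution list and function here are hypothetical; a realistic defense would paraphrase whole posts with a model:

```python
import random

# Hypothetical style-flattening rewrites; each targets a stylometric
# signal (contractions, ellipses, doubled exclamation marks).
SUBSTITUTIONS = [
    ("don't", "do not"), ("can't", "cannot"), ("it's", "it is"),
    ("...", "."), ("!!", "!"),
]

def randomize_style(text, rng=None, p=0.5):
    """Apply each substitution with probability p, perturbing the
    stylometric fingerprint of a post before it is published."""
    rng = rng or random.Random()
    for old, new in SUBSTITUTIONS:
        if rng.random() < p:
            text = text.replace(old, new)
    return text
```

Because each post is perturbed differently, an attacker correlating n-gram or punctuation features across platforms sees a noisier, less distinctive profile.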
What Users Can Do
While no method is foolproof against determined LLM-based analysis, users can reduce their linkability:
- Use different writing styles across platforms
- Avoid sharing personal details that could serve as linkage points
- Consider browser-level protections against cross-site tracking
Source: Ars Technica | Research Paper