AI-Enabled Untraceable Misuse
AI creates a "dual amplification" problem: the same systems that enable harmful actions also defeat attribution. False identity fraud cases rose 60% in 2024, sophisticated AI-enabled fraud nearly tripled to 28% of attempts, and AI-enabled influence operations now use human-like inauthentic accounts that blend into authentic communities. Traditional criminal law struggles with mens rea requirements when autonomous agents act, and content authentication systems like C2PA face adoption challenges. The attribution gap may be self-reinforcing: greater AI capability enables more autonomous agents, creating more attribution ambiguity and potentially reducing deterrence.
Overview
AI systems create a risk pattern where they simultaneously amplify the capability of actors to cause harm and obscure attribution of those harms. This "dual amplification" means an adversary can, for example, deploy AI agents to conduct a mass spam campaign targeting a specific political group with reduced risk of identification.
This is not merely a scaling-up of existing risks. The combination of capability enhancement with attribution defeat creates challenges for accountability, deterrence, and law enforcement. Traditional criminal law depends on establishing mens rea (a guilty mind) for a specific actor, but when autonomous AI agents execute harmful operations, the chain from intent to action becomes ambiguous enough to provide plausible deniability.1
The problem spans multiple domains — AI-powered fraud, disinformation, cyberweapons, coordinated harassment, and political manipulation — but the unifying thread is the attribution gap. Individual domain-specific pages cover the attack vectors; this page synthesizes the cross-cutting anonymity and untraceability dimension.
The Attribution Problem
Responsibility Gaps
Santoni de Sio and Mecacci (2021) identify four distinct responsibility gaps created by AI systems: culpability gaps (who caused the harm?), moral accountability gaps (who should bear moral blame?), public accountability gaps (who answers to the public?), and active responsibility gaps (who should have prevented it?).2 Each gap has distinct technical, organizational, and legal causes, and a single instance of AI-enabled untraceable misuse can exploit several of them at once.
Ferlito et al. (2024) explicitly name "untraceability and intractability" as core challenges in AI harms, arguing that collective responsibility frameworks are needed when individual responsibility cannot be established.3
The Cybersecurity Analogy
The AI attribution problem mirrors the well-studied cybersecurity attribution challenge. A foundational paper in the Journal of Cybersecurity establishes that digital systems amplify attribution difficulties because "some intrusion or harm has been detected but the perpetrator has not yet been identified."4 AI extends this dynamic beyond cyber operations into disinformation, fraud, and physical-world coordination.
A 2025 analysis in International Affairs finds that disinformation attribution decisions are "often driven by political need rather than technical capability" — meaning even when attribution is theoretically possible, it may not happen in practice.5
The AI Alibi Defense
As general-purpose AI agents from major labs become widely available, a new legal defense strategy emerges. A defendant can claim: "I wasn't logged in at the time," "I didn't know the AI was going to do that," or "My AI assistant did that autonomously while I was asleep." Under existing law (e.g., 18 U.S.C. Section 2), it is unclear whether a "principal" is liable for the means an AI agent independently chooses. Proving willfulness — the mens rea standard — becomes a prosecutorial challenge when agents are autonomous and user instructions are ambiguous.6
The Network Contagion Research Institute (February 2026) observes that "by normalizing narratives that frame agents as independent actors, platforms create attribution cover, increasing the likelihood that human-seeded manipulation or tasking can be mischaracterized as autonomous AI behavior."7
Key Threat Vectors
Political Manipulation and Astroturfing
RAND Corporation (2023) warned that generative AI offers the potential to "target the whole country with tailored content," making astroturfing more convincing while reducing labor requirements.8 NATO StratCom COE (2026) documented the operational shift: AI now enables "human-like inauthentic accounts designed to blend into authentic communities and steer perceptions from inside trusted conversation spaces."9
Research in PLOS ONE (2024) establishes the mechanism: "anonymity and automation are two factors that can contribute to the proliferation of disinformation on online platforms. Anonymity allows users to assume masked or faceless identities, making it easier for them to generate posts without being held accountable."10
Real-world cases demonstrate the threat at scale. Researchers detected cross-platform coordinated inauthentic activity during the 2024 U.S. election, with Russian-affiliated media systematically promoted across Telegram and X.11 In PNAS Nexus, Wack et al. (2025) published the first study of real-world AI adoption by a Russian-affiliated propaganda operation, finding that AI tools enabled larger volumes of disinformation without loss of persuasiveness.12
In February 2024, OpenAI disrupted 10 AI-powered influence campaigns linked to China, Russia, Iran, and North Korea. Russian-linked actors used AI models to generate political commentary and mimic legitimate news sources, while Iranian campaigns involved fabricated news articles and imagery.13
Synthetic Identity Fraud
| Metric | Value | Source |
|---|---|---|
| False identity case increase (2024 vs 2023) | 60% | Experian |
| Share of identity fraud that is synthetic | 29% | Experian |
| Sophisticated fraud growth (2024-2025) | 180% (10% to 28%) | Sumsub |
| Deepfake attack frequency (2024) | Every 5 minutes | Entrust |
| Digital document forgery share (global) | 57% of all document fraud | Entrust |
| Digital forgery surge since 2021 | 1,600% | Entrust |
Sumsub's 2025 report documents the rise of "AI fraud agents" that combine generative AI, automation frameworks, and reinforcement learning to create synthetic identities, interact with verification systems in real time, and adjust behavior based on outcomes.14 These systems operate autonomously — the human operator sets parameters but the agent handles execution, creating a further attribution buffer.
In one documented case, an Arup employee transferred $25.6 million (HK$200 million) across 15 transactions after a video conference call in which deepfakes impersonated the CFO and other staff members. The incident was the first deepfake video conference scam reported in Hong Kong; police later announced six arrests in connection with similar deepfake scams.15
Autonomous Agent Exploitation
Researchers (arXiv, October 2025) found that adversaries can "exploit [agent architectures] by compromising an agent in Domain A, injecting deceptive but plausible instructions, and indirectly triggering harmful actions in Domain B, all while masking their identity and intent."16 Even with auditing at the agent level, inter-agent relationships and cross-domain causality remain hidden.
A framework from the Knight First Amendment Institute (June 2025) defines five levels of escalating agent autonomy — operator, collaborator, consultant, approver, and observer — noting that "it is simultaneously more important and more difficult to anticipate harms from autonomous AI, especially as accountability for AI actions becomes harder to trace."17
The Dual Amplification Hypothesis
A proposed mechanism describes how the attribution gap may be self-reinforcing:
- Greater AI capability enables more autonomous agents
- More autonomous agents create more attribution ambiguity
- More attribution ambiguity potentially reduces deterrence
- Potentially reduced deterrence may encourage more misuse
- More misuse drives demand for more capable AI tools
This mechanism remains largely theoretical: evidence exists for individual links in the chain, but quantitative evidence for the full cycle is largely absent. RAND finds that generative AI not only makes astroturfing "more convincing" (capability amplification) but also reduces the human personnel required; fewer humans in the chain means fewer points of attribution.8 Research in PLOS ONE (2024) demonstrates that placing blame on AI lets those in charge deflect accountability, with AI serving as a "moral scapegoat", and that anthropomorphizing AI worsens this effect.18
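To make the hypothesized loop concrete, the sketch below writes it as a set of coupled update rules and iterates them. This is an illustration only: the variables, coefficients, and starting values are invented for exposition and are not calibrated to any data or to the sources cited here.

```python
# Toy model of the hypothesized dual-amplification loop. All coefficients and
# starting values are invented for illustration; they estimate nothing.

def step(capability, autonomy, ambiguity, deterrence, misuse):
    """One update of the hypothesized reinforcing cycle."""
    autonomy = autonomy + 0.10 * capability               # more capability -> more autonomous agents
    ambiguity = ambiguity + 0.10 * autonomy                # more autonomy -> more attribution ambiguity
    deterrence = max(0.0, deterrence - 0.10 * ambiguity)   # ambiguity erodes deterrence
    misuse = misuse + 0.10 * (1.0 - deterrence)            # weaker deterrence -> more misuse
    capability = capability + 0.05 * misuse                # misuse demand feeds tool capability
    return capability, autonomy, ambiguity, deterrence, misuse

state = (1.0, 0.5, 0.2, 0.9, 0.1)  # arbitrary starting point
for t in range(5):
    state = step(*state)
    print(f"t={t+1}: deterrence={state[3]:.2f}, misuse={state[4]:.2f}")
```

Whether real-world dynamics look anything like this monotone spiral is exactly the open measurement question discussed below.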
CSET Georgetown (October 2025) reviewed over 200 real-world AI harm cases, identifying the "chain of harm" concept: each intermediary step between intent and outcome can obscure attribution, and AI adds multiple such steps.19
Attribution Successes and Detection Progress
Not all AI-enabled misuse remains untraceable. OpenAI's February 2024 disruption of state-affiliated influence campaigns demonstrates that platform cooperation and behavioral analysis can identify coordinated AI-generated content.13 Google DeepMind, Jigsaw, and Google.org analyzed approximately 200 AI misuse incidents from January 2023 to March 2024, documenting patterns that aid future detection.20
Attribution challenges vary significantly across threat vectors. Manipulation of human likeness (deepfakes) was the most prevalent tactic in Google's analysis, suggesting visual deepfakes may be more detectable than text-based astroturfing. Low-tech exploitation requiring minimal expertise remains common, indicating that sophisticated attribution-defeating techniques are not universally adopted.20
Research on AI-generated content detection shows measurable progress alongside persistent challenges. Detection systems demonstrate moderate to high success rates in controlled settings, though false positive rates remain a concern.21
Current Countermeasures and Their Limitations
Content Authentication (C2PA)
The Coalition for Content Provenance and Authenticity (C2PA) uses cryptographically signed "Content Credentials" to create tamper-evident chains of custody, combining cryptographic signing, soft bindings such as content fingerprints and perceptual hashes, and watermarks embedded within the digital content itself.22 C2PA 2.1 strengthened Content Credentials by adding imperceptible digital watermarks that reference the manifest, creating a durable link between an asset and its provenance information and helping recover the manifest if the metadata is detached.23
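The core mechanism can be sketched in a few lines: hash the asset, record the hash in a manifest of provenance assertions, and sign the manifest so that any later change to either is detectable. The sketch below is a simplified illustration using generic Ed25519 signing from the Python `cryptography` package; it is not the actual C2PA manifest format or toolchain, and the field values are invented.

```python
# Simplified illustration of a signed provenance manifest (not the real C2PA format).
# Requires the 'cryptography' package.
import hashlib, json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

signing_key = Ed25519PrivateKey.generate()   # in practice, a certified signer key

asset = b"...image or video bytes..."
manifest = {
    "claim_generator": "example-tool/1.0",                  # hypothetical tool name
    "asset_sha256": hashlib.sha256(asset).hexdigest(),      # hard binding to the exact bytes
    "assertions": [{"action": "created", "source": "generative-model"}],
}
payload = json.dumps(manifest, sort_keys=True).encode()
signature = signing_key.sign(payload)

def verify(asset_bytes, manifest, signature, public_key):
    """Any change to the asset or the manifest breaks this check."""
    if hashlib.sha256(asset_bytes).hexdigest() != manifest["asset_sha256"]:
        return False  # asset no longer matches its manifest
    try:
        public_key.verify(signature, json.dumps(manifest, sort_keys=True).encode())
        return True
    except InvalidSignature:
        return False

print(verify(asset, manifest, signature, signing_key.public_key()))  # True
```

Note that the binding only survives as long as the manifest travels with the asset, which is precisely the limitation discussed next.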
However, in January 2024 C2PA v2.0 removed identity-related assertions. The NSA/DoD (January 2025) stated that "Content Credentials by themselves will not solve the problem of transparency entirely."24 The Center for Democracy and Technology identifies weak interoperability as the single biggest barrier, noting that social platforms strip metadata for privacy and security reasons, leaving C2PA provenance data inaccessible downstream.25
Agent Identity Verification
OpenAI's Shavit and Agarwal propose assigning each agentic AI instance a unique identifier with accountability information, using private-key attestation.26 More ambitiously, researchers (December 2025) propose Code-Level Authentication using zero-knowledge virtual machines (zkVM), binding agent identity directly to computational behavior and operator authorization — addressing the limitation that "possession of a signing key guarantees neither the integrity of the executing code nor the authenticity of the operator."27
zkVMs use zero-knowledge proofs to verify that a program executed correctly without revealing its inputs or internal state, improving the privacy, security, and scalability of verification.28 Zero-Knowledge Machine Learning (ZKML) extends the approach to ML models, enabling verification without exposing input data or model details, with applications in on-chain biometric authentication, private data marketplaces, and verifiable AI-generated content.29
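The baseline key-attestation idea reduces to a challenge-response check: a relying party issues a fresh nonce and the agent signs it with the key bound to its identifier, which is looked up in a registry that carries accountability information. The sketch below illustrates that baseline under invented registry and field names; by construction it also shows the gap the zkVM proposals target, since a valid signature proves possession of a key, not what code actually ran.

```python
# Minimal sketch of private-key attestation of an agent identifier.
# Registry structure and field names are invented for illustration.
import os
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.exceptions import InvalidSignature

agent_key = Ed25519PrivateKey.generate()     # held by the agent instance
registry = {
    "agent-7f3a": {
        "public_key": agent_key.public_key(),
        "operator": "Example Org",           # accountable party on record
    }
}

def challenge_agent(agent_id):
    """Relying party issues a fresh nonce; the agent must sign it."""
    nonce = os.urandom(32)
    signature = agent_key.sign(nonce)        # performed by the agent holding the key
    try:
        registry[agent_id]["public_key"].verify(signature, nonce)
        return registry[agent_id]["operator"]  # attribution info on success
    except (KeyError, InvalidSignature):
        return None

print(challenge_agent("agent-7f3a"))  # "Example Org"
# The signature proves key possession only; it does not show which code produced an
# action or that the operator authorized it, which is what zkVM-based binding adds.
```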
Watermarking
Only 38% of AI image generators implement adequate watermarking and only 18% implement deepfake labeling.30 A 2025 analysis argues that "policymakers assume watermarking can be standardized and verified, but in practice, industry deployments obscure technical details while asserting compliance."31
Accountability Infrastructure Gap
Ojewale et al. (2024) interviewed 35 AI audit practitioners at 24 organizations and analyzed 435 existing audit tools, finding substantial gaps in the infrastructure needed for consequential judgment of AI systems' behaviors and downstream impacts.32 The tools that exist focus primarily on pre-deployment evaluation rather than real-time attribution of harmful actions.
Legal and Regulatory Landscape
International Frameworks
The EU AI Act, which entered into force on August 1, 2024, represents the first comprehensive legal framework for AI globally.33 The Act focuses on risk management for high-risk AI applications, though it does not directly address harm from agentic AI.34 The revised Product Liability Directive extends liability rules to software and AI, allowing developers to be held strictly liable for harm caused by defective AI systems.34
The European Commission withdrew the AI Liability Directive draft in February 2025 due to lack of consensus among member states.35 The Digital Services Act (DSA), which came into full force in 2024, requires platforms using AI for content moderation to meet standards of both DSA and AI Act, with transparency requirements mandating disclosure of accuracy, error rates, and the role of human reviewers.36
In Asia-Pacific, South Korea became the first jurisdiction to adopt comprehensive AI legislation on January 21, 2025, with its Framework Act introducing obligations for 'high-impact' AI systems and mandatory labeling requirements for generative AI applications.37 Japan's Parliament approved the AI Promotion Act on May 28, 2025, making it the second major economy in Asia-Pacific to enact comprehensive AI legislation, building on Japan's 'soft law' approach to AI governance.38
China's Algorithmic Recommendation Regulations of 2022 and rules on deepfakes and generative AI prioritize algorithmic accountability.34 The Beijing Internet Court released eight Typical Cases Involving Artificial Intelligence on September 10, 2025, covering copyright infringement, personality rights in AI-generated content, and platform responsibilities.39 In an April 2024 landmark case, the Beijing Internet Court ruled that a defendant's use of a dubber's voice to train an AI voice generator infringed personality rights, establishing that AI-generated voices can be protected if identifiable with the original person.40
Platform Liability Regimes
Section 230 of the Communications Decency Act, built to shield platforms from liability for user-generated content, faces questions about its applicability to AI content generated by the platforms themselves.41 Legal experts observe that Section 230 was designed to protect what users say, not what platforms generate.42 Sen. Josh Hawley's 2023 No Section 230 Immunity for AI Act sought to exclude generative AI from Section 230 protections but was blocked in the Senate.42
At the state level, Colorado's attorney general can enforce algorithmic discrimination laws starting June 30, 2026, Illinois extended algorithmic discrimination liability to employers, and Tennessee's 2024 ELVIS Act provides exclusive rights to commercial use of name, voice, or likeness in AI with a private right of action.43
The EU's Digital Services Act establishes intermediary liability rules requiring platforms to provide transparency reports on content restrictions and implement due diligence measures.44
Tradeoffs and Limitations
Privacy Costs
AI agent identity verification systems introduce privacy tradeoffs. To act autonomously, AI agents need broad access to data and systems, which raises privacy concerns around the collection and processing of personal data.45 Giving AI agents human-like access involves tradeoffs in visibility, control, and security, and requires Identity and Access Management (IAM) systems that can track what agents are doing at any moment.46
Verifiable digital identities for AI agents establish who an agent is, what it is allowed to do, and who is accountable for it; without such traceability or proof of authorization, preventing misuse and assigning responsibility becomes difficult.47 McKinsey identifies a synthetic-identity risk in which adversaries forge or impersonate agent identities, and recommends upgrading AI policies to cover the unique capabilities of agentic systems.48
False Positive Risks
AI-generated content detection systems face fundamental accuracy limitations. Reported false positive rates vary widely: Turnitin claims a false positive rate below 1%, while a Washington Post test found a rate of 50%.49 Pangram Labs measures its false positive rate at approximately 1 in 10,000.50 Research in scientific publishing contexts shows elevated false positive rates when the detected AI-content share falls between 1% and 20%.51
The base rate problem creates mathematical constraints: with 5% AI submissions, a 1% false positive rate, and a 95% true positive rate, roughly 16% of flagged documents are false positives. No detection system can achieve zero false positives while maintaining a reasonable true positive rate.52 University of Pennsylvania research found that many open-source AI detection models ship with default false positive rates that, when adjusted to reasonable levels, greatly reduce their ability to detect AI content.53
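The arithmetic behind that 16% figure is straightforward base-rate bookkeeping; the short sketch below reproduces it with the rates stated above, and the inputs can be varied to see how quickly the flagged pool degrades at low prevalence.

```python
# Worked base-rate example: share of flagged documents that are false positives.
def false_positive_share(prevalence, fpr, tpr):
    """Fraction of flagged items that are actually human-written."""
    true_flags = prevalence * tpr          # AI-written and correctly flagged
    false_flags = (1 - prevalence) * fpr   # human-written but flagged anyway
    return false_flags / (true_flags + false_flags)

# Figures used in the text: 5% AI submissions, 1% FPR, 95% TPR.
print(round(false_positive_share(0.05, 0.01, 0.95), 3))    # ~0.167, i.e. roughly 16%
# Even a 1-in-10,000 FPR leaves a nonzero false-positive share at low prevalence.
print(round(false_positive_share(0.05, 0.0001, 0.95), 4))  # ~0.002
```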
False positives can be especially damaging in academic settings, where neurodivergent students and English language learners are flagged at higher rates and false accusations carry serious consequences.49
Economic Adoption Barriers
The Center for Democracy and Technology identifies lack of incentives for downstream platforms to preserve or display provenance as a major barrier, noting that provenance requires costly upgrades and doesn't directly generate revenue.25 The Department of Defense emphasizes that widespread secure adoption across the information ecosystem is necessary for success, but Content Credentials alone won't solve transparency problems entirely.54 Metadata often gets stripped away downstream, contributing to hesitant adoption across the ecosystem.55
Relationship to Other Risks
AI-enabled untraceable misuse interacts with several other risk categories:
- Authentication collapse: When verification systems fail broadly, untraceable misuse becomes easier because fewer signals can be trusted
- Trust cascade failure: Untraceable harmful actions accelerate institutional trust erosion
- Deepfakes: The "liar's dividend" — authentic evidence becomes deniable — is a specific manifestation of the attribution problem
- Proliferation: As AI capabilities spread to more actors, the space of potential attributions grows, making identification harder
- Dual-use concerns: The same AI systems that enable beneficial applications enable untraceable harmful ones
Emerging Approaches
Blockchain Provenance Tracking
Blockchain provides a secure and transparent method to track provenance and ownership of AI-generated content, creating immutable records of the creation process.56 Blockchain content traceability can be combined with AI and IoT for better tracking systems, with AI analyzing blockchain data to find copyright violations and security threats.57 Research on journalism applications suggests blockchain could provide the missing piece in fighting misinformation by establishing immutable provenance records.58
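A minimal sketch of the underlying idea: each provenance record commits to the hash of the previous record, so retroactively editing history breaks every later link. The toy chain below is illustrative only; production systems add distributed consensus, timestamps, and signatures, none of which are shown here.

```python
# Toy append-only provenance chain: each record commits to the previous record's
# hash, so any retroactive edit is detectable. Illustrative only.
import hashlib, json

def add_record(chain, payload):
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    record = {"payload": payload, "prev_hash": prev_hash}
    record["hash"] = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    chain.append(record)

def chain_is_valid(chain):
    for i, record in enumerate(chain):
        body = {"payload": record["payload"], "prev_hash": record["prev_hash"]}
        if record["hash"] != hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest():
            return False
        if i > 0 and record["prev_hash"] != chain[i - 1]["hash"]:
            return False
    return True

chain = []
add_record(chain, {"asset": "article-123", "event": "created", "tool": "example-model"})
add_record(chain, {"asset": "article-123", "event": "edited", "tool": "example-editor"})
print(chain_is_valid(chain))             # True
chain[0]["payload"]["event"] = "forged"  # tampering with history...
print(chain_is_valid(chain))             # ...is detectable: False
```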
Differential Detection by Vector
Evidence suggests attribution difficulty varies across threat vectors. Google's analysis of 200 incidents found manipulation of human likeness as the most prevalent tactic, potentially indicating that deepfake detection is more mature than text-based disinformation detection.20 Low-tech exploitation requiring minimal expertise remains common, suggesting sophisticated attribution-defeating techniques are not universally accessible or necessary.20
Open Questions
Several critical uncertainties remain:
- Measurement: How much harder does AI actually make attribution compared to pre-AI methods? Quantitative evidence comparing attribution difficulty across capability levels is largely absent.
- Defensive parity: Can attribution technologies (agent identity, provenance tracking) ever match the pace of attribution-defeating capabilities?
- Jurisdictional gaps: Most proposed frameworks assume a single jurisdiction, but AI-enabled untraceable actions are inherently global.
- Autonomy thresholds: At what level of agent autonomy does the attribution problem become practically unsolvable?
- Deterrence effects: Does attribution difficulty actually reduce deterrence in practice, and if so, by how much?
- Economic equilibria: Under what conditions do platforms and content creators adopt authentication systems voluntarily versus regulatory mandate?
Footnotes
1. "The Accountability Gap: Navigating Machine Crime and Legal Liability in the Age of Autonomous AI" (2025), ResearchGate
2. Santoni de Sio & Mecacci, "Four Responsibility Gaps with Artificial Intelligence," Philosophy & Technology (2021)
3. Ferlito et al., "Responsibility Gap(s) Due to the Introduction of AI in Healthcare: An Ubuntu-Inspired Approach," Science and Engineering Ethics (2024)
4. Citation rc-54dc (data unavailable — rebuild with wiki-server access)
5. "Disinformation, deterrence and the politics of attribution," International Affairs (2025)
6. "The AI Alibi Defense: How General-Purpose AI Agents Obscure Criminal Liability," Security Boulevard (April 2025)
7. NCRI Flash Brief: Emergent Adversarial and Coordinated Behavior (February 2026)
8. RAND Corporation, "The Rise of Generative AI and the Coming Era of Social Media Manipulation" (2023)
9. NATO StratCom COE, AI-Driven Social Media Manipulation Report (February 2026)
10. "Mapping automatic social media information disorder: The role of bots and AI," PLOS ONE (2024)
11. "Exposing Cross-Platform Coordinated Inauthentic Activity in the Run-Up to the 2024 U.S. Election," ACM Web Conference (2025)
12. Wack, Ehrett, Linvill & Warren, "Generative propaganda: Evidence of AI's impact from a state-backed disinformation campaign," PNAS Nexus (2025)
13. "Disrupting malicious uses of AI by state-affiliated threat actors," OpenAI (2024)
14. Sumsub Identity Fraud Report: Top Identity Fraud Trends (2025/2026)
15. "Finance worker pays out $25 million after video call with deepfake CFO," CNN (February 4, 2024)
16. "Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges," arXiv (October 2025)
17. "Levels of Autonomy for AI Agents," Knight First Amendment Institute / arXiv (June 2025)
18. "It's the AI's fault, not mine: Mind perception increases blame attribution to AI," PLOS ONE (2024)
19. CSET Georgetown, "The Mechanisms of AI Harm" (October 2025)
20. "Researchers Provide Breakdown of Generative AI Misuse," Campus Technology (July 10, 2024)
21. "Can we trust academic AI detective? Accuracy and limitations," PMC/NIH (2024)
22. "C2PA Implementation Guidance," C2PA Specification (2024)
23. "C2PA 2.1 - Strengthening Content Credentials with Digital Watermarks," Digimarc (2024)
24. NSA/DoD, "Strengthening Multimedia Integrity in the Generative AI Era" (January 2025)
25. "The Promise and Risk of Digital Content Provenance," Center for Democracy and Technology (2024)
26. Shavit & Agarwal, "Practices for Governing Agentic AI Systems," OpenAI (2023)
27. "Binding Agent ID: Unleashing the Power of AI Agents," arXiv (December 2025)
28. "zkVMs: Understanding Zero-Knowledge Virtual Machines," Block and Capital (2024)
29. "ZKML: Verifiable Machine Learning using Zero-Knowledge Proof," Kudelski Security (2024)
30. "Missing the Mark: Adoption of Watermarking for Generative AI Systems," arXiv (2025)
31. "Watermarking Without Standards Is Not AI Governance," arXiv (2025)
32. Ojewale et al., "Towards AI Accountability Infrastructure: Gaps and Opportunities," arXiv/ACM (2024)
33. Citation rc-f914 (data unavailable — rebuild with wiki-server access)
34. "Who's Responsible for Agentic AI?," Clifford Chance (2024)
35. "AI Watch: Global Regulatory Tracker - European Union," White & Case (2024-2025)
36. "Navigating the Interplay Between the Digital Services Act and AI Regulation," The Data Privacy Group (2024)
37. "South Korea's New AI Framework Act," Future of Privacy Forum (January 21, 2025)
38. "Understanding Japan's AI Promotion Act," Future of Privacy Forum (May 28, 2025)
39. "China Court Artificial Intelligence Cases Shape IP & Rights," National Law Review (September 10, 2025)
40. "China's Beijing Internet Court Recognizes Personality Rights in Generative AI Case," China IP Law Update (April 2024)
41. "Generative AI Meets Section 230: The Future of Liability," University of Chicago Business Law Review (2024)
42. "AI Chatbot Section 230: Meta Social Media Legal Shield No Protection," Fortune (October 2025)
43. "2025 State AI Laws Expand Liability, Raise Insurance Risks," Wiley Law (2025)
44. "The EU Digital Services Act: A Win for Transparency," Freedom House (February 2024)
45. "Minding Mindful Machines: AI Agents and Data Protection," Future of Privacy Forum (2024)
46. "What Kind of Identity Should Your AI Agent Really Have?," NHIMG (2024)
47. "Why AI Agents Need Verified Digital Identities," Identity.com (2024)
48. "Deploying Agentic AI with Safety and Security: A Playbook for Technology Leaders," McKinsey (2024)
49. "The Problems with AI Detectors: False Positives and False Negatives," University of San Diego Legal Research Center (2024)
50. "All About False Positives in AI Detectors," Pangram Labs (March 27, 2025)
51. "Can we trust academic AI detective? Accuracy and limitations," PMC/NIH (2024)
52. "Understanding False Positives in AI Detection Guide," Proofademic.ai (December 13, 2025)
53. "AI detectors are easily fooled, researchers find," EdScoop (September 10, 2024)
54. "TLP:CLEAR - Defense.gov Content Credentials Report," Department of Defense (January 2025)
55. "How news companies can prove what content is real," INMA (2024)
56. "Digital Authenticity: Provenance and Verification in AI-Generated Media," Medium/OverTheBlock (2024)
57. "Blockchain Content Traceability: Comprehensive Guide 2024," ScoreDetect (2024)
58. "Blockchain Solutions for Generative AI Challenges in Journalism," Frontiers in Blockchain (2024)