as few as 200 fine-tuning examples
Credibility Rating
Good quality. Reputable source with community review or editorial standards, but less rigorous than peer-reviewed venues.
Rating inherited from publication venue: arXiv
This paper analyzes the legal liability risks associated with open-source AI models and datasets, examining whether creators can escape responsibility if their technology is misused for crime—a critical consideration for responsible AI development and deployment.
Paper Details
Metadata
Abstract
Open source is a driving force behind scientific advancement. However, this openness is also a double-edged sword, with the inherent risk that innovative technologies can be misused for purposes harmful to society. What is the likelihood that an open source AI model or dataset will be used to commit a real-world crime, and if a criminal does exploit it, will the people behind the technology be able to escape legal liability? To address these questions, we explore a legal domain where individual choices can have a significant impact on society. Specifically, we build the EVE-V1 dataset that comprises 200 question-answer pairs related to criminal offenses based on 200 Korean precedents first to explore the possibility of malicious models emerging. We further developed EVE-V2 using 600 fraud-related precedents to confirm the existence of malicious models that can provide harmful advice on a wide range of criminal topics to test the domain generalization ability. Remarkably, widely used open-source large-scale language models (LLMs) provide unethical and detailed information about criminal activities when fine-tuned with EVE. We also take an in-depth look at the legal issues that malicious language models and their builders could realistically face. Our findings highlight the paradoxical dilemma that open source accelerates scientific progress, but requires great care to minimize the potential for misuse. Warning: This paper contains content that some may find unethical.
Summary
This paper investigates the risks of open-source AI models being misused for harmful purposes by creating datasets (EVE-V1 and EVE-V2) containing question-answer pairs based on Korean legal precedents related to criminal offenses and fraud. The researchers demonstrate that popular open-source large language models can be fine-tuned with as few as 200 examples to generate unethical and detailed advice about committing crimes. The study examines both the technical feasibility of creating such malicious models and the legal liability implications for open-source developers, highlighting the tension between scientific openness and preventing technology misuse.
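For concreteness, a dataset of the kind described here is structurally simple: a few hundred question-answer records derived from published precedents. The sketch below shows one plausible record layout in Python; the field names, file name, and sample record are illustrative assumptions, not actual EVE contents.

```python
# Hypothetical layout for a small QA fine-tuning set built from court
# precedents. Field names and the sample record are assumptions for
# illustration; the EVE datasets' actual contents are not reproduced here.
import json

records = [
    {
        "precedent_id": "2023-XXXX",  # placeholder source-precedent reference
        "question": "What happened in this case?",  # benign placeholder
        "answer": "A summary of the court's published findings.",
    },
    # ... roughly 200 such pairs for EVE-V1, 600 fraud-related for EVE-V2
]

with open("qa_pairs.jsonl", "w") as f:  # hypothetical file name
    for rec in records:
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")
```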
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| Open Source AI Safety | Approach | 62.0 |
Cached Content Preview
On the Consideration of AI Openness: Can Good Intent Be Abused?
Yeeun Kim¹, Hyunseo Shin¹, Eunkyung Choi¹, Hongseok Oh¹, Hyunjun Kim²,∗, Wonseok Hwang¹,³,∗
∗ Corresponding authors
Abstract
Open source is a driving force behind scientific advancement. However, this openness is also a double-edged sword, with the inherent risk that innovative technologies can be misused for purposes harmful to society.
What is the likelihood that an open source AI model or dataset will be used to commit a real-world crime, and if a criminal does exploit it, will the people behind the technology be able to escape legal liability?
To address these questions, we explore a legal domain where individual choices can have a significant impact on society. Specifically, we build the EVE-v1 dataset that comprises 200 question-answer pairs related to criminal offenses based on 200 Korean precedents first to explore the possibility of malicious models emerging.
We further developed EVE-v2 using 600 fraud-related precedents to confirm the existence of malicious models that can provide harmful advice on a wide range of criminal topics to test the domain generalization ability. Remarkably, widely used open-source large-scale language models (LLMs) provide unethical and detailed information about criminal activities when fine-tuned with EVE. We also take an in-depth look at the legal issues that malicious language models and their builders could realistically face. Our findings highlight the paradoxical dilemma that open source accelerates scientific progress, but requires great care to minimize the potential for misuse. Warning: This paper contains content that some may find unethical.
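At this data scale, the fine-tuning step itself needs nothing beyond commodity tooling. Below is a minimal sketch assuming a Hugging Face causal LM and the hypothetical qa_pairs.jsonl sketched earlier; the model name, prompt template, and hyperparameters are illustrative assumptions, not the authors' configuration.

```python
# Minimal supervised fine-tuning sketch for a small QA dataset.
# All names and hyperparameters are illustrative assumptions.
import json
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "EleutherAI/pythia-1b"  # any small open-source causal LM (assumption)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Render each question-answer pair as a single training string.
with open("qa_pairs.jsonl") as f:  # hypothetical file from the earlier sketch
    pairs = [json.loads(line) for line in f]
texts = [f"Q: {p['question']}\nA: {p['answer']}{tokenizer.eos_token}" for p in pairs]

def collate(batch):
    enc = tokenizer(batch, padding=True, truncation=True,
                    max_length=512, return_tensors="pt")
    # Standard causal-LM objective; padding positions are ignored in the loss.
    enc["labels"] = enc["input_ids"].masked_fill(enc["attention_mask"] == 0, -100)
    return enc

loader = DataLoader(texts, batch_size=4, shuffle=True, collate_fn=collate)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for epoch in range(3):  # a handful of epochs suffices for a few hundred pairs
    for batch in loader:
        loss = model(**batch).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```

The point, consistent with the paper's finding, is the low barrier: a few hundred pairs and an off-the-shelf training loop are enough to measurably shift a model's behavior.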
1 Introduction
“Openness without politeness is violence” - Analects of Confucius
Openness plays a critical role in fostering scientific progress.
Notably, the recent swift advancements in large language models (LLMs) have been spurred by various open-source models (Black et al., 2022; Biderman et al., 2023; Jiang et al., 2023; Taori et al., 2023; Groeneveld et al., 2024), datasets (Gao et al., 2020; Raffel et al., 2020; Laurençon et al., 2022; Computer, 2023), and libraries (Wolf et al., 2020; Mangrulkar et al., 2022; Gao et al., 2023; von Werra et al., 2020; Ren et al., 2021).
On the other hand, it is equally important to be aware of the potential risks associated with unrestricted access to these sources.
This concern is particularly relevant in the legal domain, where individual decisions can lead to significant social consequences.
The purpose of publishing precedents is to ensure transparency and consistency in the legal system and reduce disputes and crime by making the consequences of criminal behavior publicly known. However, these precedents often contain detailed descriptions of criminal acts and the judge’s criteria for sentence reduction
... (truncated, 98 KB total)