Deepfake Detection Challenge Dataset
deepfakedetectionchallenge.ai
A key benchmark dataset for AI-generated media detection research; relevant to AI safety discussions around synthetic media misuse, detection capabilities, and evaluation of countermeasures against harmful deepfake technology.
Metadata
Importance: 55/100 · dataset
Summary
The Deepfake Detection Challenge (DFDC) Dataset, released by Facebook AI (now Meta AI) in 2020, is a large-scale benchmark of roughly 124,000 videos designed to accelerate research on detecting AI-manipulated media. Created in partnership with industry and academic experts, it consists of videos of paid, consenting actors altered with eight facial modification algorithms. The full dataset was used in a Kaggle competition and remains publicly available to support ongoing deepfake detection research.
Key Points
- The full dataset contains about 124,000 videos featuring eight facial modification algorithms; a smaller preview dataset of about 5,000 videos features two.
- Created by Facebook AI (now Meta AI) with industry and academic partners as part of the Deepfake Detection Challenge, launched in September 2019.
- Used in a Kaggle competition in which researchers worldwide developed and benchmarked deepfake detection models.
- Created with paid actors who consented to the use and manipulation of their likenesses.
- Access requires an AWS account with an IAM user and access keys; the associated papers are on arXiv (1910.08854 for the preview dataset, 2006.07397 for the full dataset).
Cited by 1 page
| Page | Type | Quality |
|---|---|---|
| AI-Enabled Historical Revisionism | Risk | 43.0 |
Cached Content Preview
HTTP 200 · Fetched Apr 7, 2026 · 5 KB
JUNE 25, 2020
Deepfake Detection Challenge Dataset
The Deepfake Detection Challenge Dataset is designed to measure progress on deepfake detection technology.
Overview
We partnered with other industry leaders and academic experts in September 2019 to create the Deepfake Detection Challenge (DFDC) in order to accelerate development of new ways to detect deepfake videos. In doing so, we created and shared a unique new dataset for the challenge consisting of more than 100,000 videos. The DFDC has enabled experts from around the world to come together, benchmark their deepfake detection models, try new approaches, and learn from each other’s work.
The DFDC dataset consists of two versions:
| Version | Size | Facial modification algorithms | Associated research paper |
|---|---|---|---|
| Preview dataset | 5k videos | Two | arXiv:1910.08854 |
| Full dataset | 124k videos | Eight | arXiv:2006.07397 |
The full dataset was used by participants in a Kaggle competition to create new and better models for detecting manipulated media. Facebook created the dataset with paid actors who agreed to the use and manipulation of their likenesses.
We hope that by making this dataset available outside the challenge, the research community will continue to accelerate progress on detecting harmful manipulated media.
More information on Facebook AI’s work in this space can be found in this blog post.
If using this dataset, please cite the paper associated with the relevant dataset (preview/full):
@misc{DFDC2019Preview,
title={The Deepfake Detection Challenge (DFDC) Preview Dataset},
author={Brian Dolhansky and Russ Howes and Ben Pflaum and Nicole Baram and Cristian Canton Ferrer},
year={2019},
eprint={1910.08854},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
@misc{DFDC2020,
title={The DeepFake Detection Challenge Dataset},
author={Brian Dolhansky and Joanna Bitton and Ben Pflaum and Jikuo Lu and Russ Howes and Menglin Wang and Cristian Canton Ferrer},
year={2020},
eprint={2006.07397},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Download Prerequisites
To access the datasets, each user needs an AWS account with an IAM user and access keys set up. Each user must also have their AWS account ID (account number) ready in order to sign up.
1. Create an AWS account
2. Create an IAM user
3. Make note of the AWS account ID
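The prerequisites above come down to having an IAM user and knowing the 12-digit AWS account ID it belongs to. As a minimal sketch, and not part of the official DFDC instructions, the helper below pulls the account ID out of an IAM user ARN of the standard form `arn:aws:iam::<account-id>:user/<name>`. The function name `account_id_from_arn` and the user name `dfdc-downloader` are hypothetical, chosen only for illustration.

```python
import re

# Standard IAM user ARN layout: arn:<partition>:iam::<12-digit account id>:user/<path/name>
# The partition is usually "aws", but can also be e.g. "aws-cn" or "aws-us-gov".
_IAM_USER_ARN = re.compile(r"^arn:aws[a-z-]*:iam::(\d{12}):user/.+$")


def account_id_from_arn(arn: str) -> str:
    """Return the 12-digit AWS account ID embedded in an IAM user ARN.

    Hypothetical helper for illustration only. Raises ValueError for any
    string that does not look like an IAM user ARN.
    """
    match = _IAM_USER_ARN.match(arn.strip())
    if match is None:
        raise ValueError(f"not an IAM user ARN: {arn!r}")
    return match.group(1)


if __name__ == "__main__":
    # Placeholder ARN; 123456789012 is a dummy account ID, not a real one.
    print(account_id_from_arn("arn:aws:iam::123456789012:user/dfdc-downloader"))
```

If the AWS CLI is already configured with the IAM user's access keys, `aws sts get-caller-identity` reports the same account ID and ARN directly, which is usually the quicker check.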
Results and Impact of the DFDC
The top-performing model on the public dataset achieved 82.56 percent average precision, a common accuracy measure for computer vision tasks. But whe
... (truncated, 5 KB total)
Resource ID: 4d7d6773b35b5278 | Stable ID: sid_3kSaKGWi6C
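The cached preview reports the top model's result as average precision (AP) without defining it. As a rough illustration, the sketch below computes the common non-interpolated form of AP: rank videos by detector score and average the precision at each rank where a true fake appears. The source does not say which AP variant the DFDC evaluation used, so this is an assumption; the function name and the convention that label `1` means "fake" are also illustrative.

```python
from typing import Sequence


def average_precision(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Non-interpolated average precision for binary labels.

    labels: 1 for a positive (assumed here to mean "fake"), 0 otherwise.
    scores: detector confidence that each item is positive; higher = more likely.
    Ties in score keep their input order, which may differ from library
    implementations that group tied scores together.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("average precision is undefined without positives")

    # Rank items from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / total_positives


if __name__ == "__main__":
    # Toy data: two fakes, ranked 1st and 3rd by the detector.
    print(average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1]))
```

In the toy call, the fakes sit at ranks 1 and 3, giving precisions of 1/1 and 2/3, so the AP is their mean, 5/6.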