Carlisle (2017) - Statistical Analysis of Anaesthesia Research Data Fabrication

web

Credibility Rating

4/5
High (4)

High quality. Established institution or organization with editorial oversight and accountability.

Rating inherited from publication venue: Wiley Online Library

This medical research integrity paper is tangentially relevant to AI safety: it demonstrates statistical methods for detecting manipulation and fabrication in datasets, concepts loosely applicable to evaluation integrity and dataset auditing in AI research.

Metadata

Importance: 18/100 | journal article | primary source

Summary

This paper by John Carlisle, published in Anaesthesia in 2017, applies a statistical method for detecting data fabrication and manipulation in clinical trials: it tests whether reported baseline variables (for instance age, height or weight) are distributed as random allocation would predict. Across 5087 trials in eight journals it found an excess of baseline means that were either too similar or too dissimilar, and trials that had already been retracted showed this pattern far more often than unretracted ones.

Key Points

  • Uses a statistical technique to flag baseline data distributions in randomized controlled trials that are implausible under random allocation, which can indicate fabrication, manipulation, or other departures from random sampling.
  • Applied the method to 5087 trials published between January 2000 and December 2015 in six anaesthetic and two general medical journals, identifying papers with highly suspicious statistical patterns.
  • The analysis contributed to major retractions in anaesthesia literature, exposing research integrity failures.
  • Demonstrates how rigorous statistical scrutiny can serve as a tool for post-publication peer review and fraud detection.
  • Has broader implications for research integrity across scientific fields, including any domain relying on randomized trials.
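
The baseline check described in the key points can be sketched in a few lines of Python. This is a minimal illustration, not the paper's procedure: the function names, the normal approximation used to turn summary statistics into a p value, and the sample rows are all assumptions made for the example.

```python
import math

def baseline_p_value(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Two-sided p value for a difference in group means.

    Assumption: a normal (z) approximation built from summary statistics,
    chosen for brevity; it is not the method used in the paper.
    """
    se = math.sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    z = (mean_a - mean_b) / se
    return math.erfc(abs(z) / math.sqrt(2))

def tail_fraction(p_values, margin=0.05):
    """Share of p values within `margin` of 0 or 1."""
    extreme = sum(1 for p in p_values if p <= margin or p >= 1 - margin)
    return extreme / len(p_values)

# Hypothetical baseline rows: (mean, SD, n) for each of two groups.
hypothetical_rows = [
    (54.1, 10.2, 60, 53.8, 9.9, 60),
    (71.0, 12.0, 45, 71.0, 12.0, 45),    # identical means give p = 1
    (168.2, 8.1, 80, 172.9, 8.4, 80),    # large gap gives p near 0
]
p_values = [baseline_p_value(*row) for row in hypothetical_rows]
print([round(p, 3) for p in p_values])
print(tail_fraction(p_values))
```

If allocation is random, p values of this kind should be spread roughly evenly between 0 and 1, so about 10% would be expected to fall within 0.05 of either end. A markedly larger share in a big sample is the kind of signal the paper reports.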

Cached Content Preview

HTTP 200 | Fetched Apr 7, 2026 | 4 KB
# Data fabrication and other reasons for non‐random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals
Authors: J. B. Carlisle
Journal: Anaesthesia
Published: 2017-08
DOI: 10.1111/anae.13938
## Abstract

Randomised, controlled trials have been retracted after publication because of data fabrication and inadequate ethical approval. Fabricated data have included baseline variables, for instance, age, height or weight. Statistical tests can determine the probability of the distribution of means, given their standard deviation and the number of participants in each group. Randomised, controlled trials have been retracted after the data distributions have been calculated as improbable. Most retracted trials have been written by anaesthetists and published by specialist anaesthetic journals. I wanted to explore whether the distribution of baseline data in trials was consistent with the expected distribution. I wanted to determine whether trials retracted after publication had distributions different to trials that have not been retracted. I wanted to determine whether data distributions in trials published in specialist anaesthetic journals have been different to distributions in non‐specialist medical journals. I analysed the distribution of 72,261 means of 29,789 variables in 5087 randomised, controlled trials published in eight journals between January 2000 and December 2015: Anaesthesia (399); Anesthesia and Analgesia (1288); Anesthesiology (541); British Journal of Anaesthesia (618); Canadian Journal of Anesthesia (384); European Journal of Anaesthesiology (404); Journal of the American Medical Association (518) and New England Journal of Medicine (935). I chose these journals as I had electronic access to the full text. Trial p values were distorted by an excess of baseline means that were similar and an excess that were dissimilar: 763/5015 (15.2%) trials that had not been retracted from publication had p values that were within 0.05 of 0 or 1 (expected 10%), that is, a 5.2% excess, p = 1.2 × 10⁻⁷.
The p values of 31/72 (43%) trials that had been retracted after publication were within 0.05 of 0 or 1, a rate different to that for unretracted trials, p = 1.03 × 10⁻¹⁰. The difference between the distributions of these two subgroups was confirmed by comparison of their overall distributions, p = 5.3 × 10⁻¹⁵. Each journal exhibited the same abnormal distribution of baseline means. There was no difference in distributions of baseline means for 1453 trials in non‐anaesthetic journals and 3634 trials in anaesthetic journals, p = 0.30. The rate of retractions from JAMA and NEJM, 6/1453 or 1 in 242, was one‐quarter the rate from the six anaesthetic journals, 66/3634 or 1 in 55, relative risk (99% CI) 0.23 (0.08–0.68), p = 0.00022. A probability threshold of 1 in 10,000 identified 8/72 (11%) retracted trials (7 by Fujii et al.) and 82/5015 (1.6%) unretracted trials. Some p values were so extreme that

... (truncated, 4 KB total)
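
The rate arithmetic quoted in the abstract can be checked with a short Python sketch. The counts come from the abstract. The `relative_risk` helper and the choice of a Wald interval on the log scale are assumptions, since the excerpt does not say how the interval was computed. Under these assumptions the output rounds to the reported 0.23 (0.08–0.68).

```python
import math
from statistics import NormalDist

def relative_risk(events_a, total_a, events_b, total_b, confidence=0.99):
    """Relative risk of group A versus group B.

    Assumption: the interval is a Wald interval on the log scale; the
    abstract does not state which method the author used.
    """
    rr = (events_a / total_a) / (events_b / total_b)
    se_log = math.sqrt(1 / events_a - 1 / total_a + 1 / events_b - 1 / total_b)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Retractions: JAMA and NEJM (6 of 1453) versus six anaesthetic journals (66 of 3634).
rr, lower, upper = relative_risk(6, 1453, 66, 3634)
print(round(rr, 2), round(lower, 2), round(upper, 2))

# Share of trials whose p values were within 0.05 of 0 or 1.
unretracted_share = 763 / 5015
retracted_share = 31 / 72
print(f"{unretracted_share:.1%}", f"{retracted_share:.0%}")
```

The sketch covers only the rates and the relative risk. It does not attempt to reproduce the p values in the abstract, whose underlying tests are not described in this excerpt.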
Resource ID: 31583ff3c5f0be0d | Stable ID: ZDE2YzA4MT