JUNE 30, 2004
VOLUME 1 NO. 13
 

Mistakes are in our Nature — and BMJ

Statistical errors are commonplace even in high-profile publications like Nature and BMJ


There are lies, there are damned lies and then there are statistics. At the heart of almost every research paper lies a P value, a number that sums up the statistical validity of research results. The lower the P value, the less likely it is that the results are a fluke. Large numbers of participants tend to drive the P value down, as do wide gaps in outcome between treatment and placebo groups.
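To see those two forces at work, here is a minimal sketch in Python (the language, the numpy and scipy libraries, the trial_p_value helper and the effect size of 0.5 are all illustrative choices of ours, not anything drawn from the study): it simulates outcomes for a placebo group and a treatment group, runs a two-sample t-test, and prints the P value as the trial grows.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)  # fixed seed so the run is reproducible

    def trial_p_value(n_per_group, effect):
        # Hypothetical helper: simulate a placebo group centred at 0 and a
        # treatment group shifted upward by `effect`, then compare them
        # with a two-sample t-test.
        placebo = rng.normal(loc=0.0, scale=1.0, size=n_per_group)
        treatment = rng.normal(loc=effect, scale=1.0, size=n_per_group)
        return stats.ttest_ind(treatment, placebo).pvalue

    # Same modest treatment effect, ever larger trials: the P value tends to fall.
    for n in (20, 80, 320):
        print(f"n = {n:4d} per group  ->  P = {trial_p_value(n, effect=0.5):.4f}")

The general pattern the output shows, the same effect yielding ever smaller P values as the groups grow, is exactly the one described above; raising the effect size has the same downward pull.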

It's widely accepted that a P value below 0.05 is a sign of statistical significance. With so much research being published, this provides a quick and handy measure of the statistical strength of a study's findings. Why bother reading the detailed methodology when all of these factors can be neatly summed up in a single figure? It's a great shortcut — if the P value can be trusted. But a review published in the May 28 issue of BMC Medical Research Methodology suggests that this trust might be misplaced.

The P value is easy to read, but hellishly complicated to generate, and it demands a set of skills that most clinicians lack. In fact, doctors' dread of the Cox proportional hazards regression model or the Poisson distribution is one of the major obstacles to research. Only the biggest trials can afford the luxury of a specialized statistician. The rest must muddle along hoping that peer review will catch any howlers.

Emili Garcia-Berthou and Carles Alcaraz, of the University of Girona in Spain, set out to learn whether the system is working. Their conclusion? Well, no, not really. Fully 11.6% of the statistical results published in Nature during 2001 were wrong, as were 11.1% of those in the British Medical Journal (BMJ). A whopping 38% of the papers in Nature contained at least one such error, as did 25% of those in the BMJ.

Most of the inaccuracies appeared to stem from transcription or rounding errors rather than from bad math. In other words, most errors crept in at the write-up and proofreading stage, the very stage at which they are supposed to be caught.

Mercifully, only one of the 28 errors found actually turned a nonsignificant result into an apparently significant one. But that appears to be mostly a matter of luck, because 12% of the errors shifted the P value by at least one order of magnitude (a true 0.004 reported as 0.04, say, or the reverse).

"Although these kinds of errors may leave the conclusions of a study unchanged, they are indicative of poor practice," said the authors. "The quality of research and scientific papers need improvement and should be more carefully checked and evaluated in these days of high publication pressure."

They suggested that one way to minimize the errors would be for journals to publish the raw data on the internet. Richard Smith, editor of the BMJ, agreed that such an approach might help. Philip Campbell, editor-in-chief of Nature, said his journal has changed its editing practices since 2001, but added that Nature will examine the study's findings before deciding whether further changes are needed.
