# 431 | ResearchBox

ResearchBox # 431 - 'Colada[99] - Hyping Fisher'

Bingo Table
1. False-Positive Rates

  Annotated Table III.png


  results for figure in post 2021 10 09.csv

  [2] False-positives for HC0-HC4 - Table III - 2021 10 08.R

  [3] Make figure 2021 10 08.R

  F1 - Colada 99 - Figure 2021 10 08.png

(12 Mb)

  All simulations r1-r15 - Table III - 2021 10 04.rds

2. Outliers

  [1] Consequences of dropping 1 observation.R



Uri Simonsohn, '[99] Hyping Fisher: The Most Cited 2019 QJE Paper Relied on an Outdated STATA Default to Conclude Regression p-values Are Inadequate', Data Colada

All content posted to ResearchBox is under a CC BY 4.0 license (all use is allowed as long as authorship of the content is attributed). When using content from ResearchBox, please cite the original work and provide a link to the URL for this box (https://researchbox.org/431).

October 12, 2021   (files may not be changed, deleted, or added)

Uri Simonsohn (urisohn@gmail.com)

The paper titled "Channeling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results" is currently the most cited 2019 article in the Quarterly Journal of Economics (372 Google cites). It delivers bad news to economists running experiments: their p-values are wrong. To get correct p-values, the article explains, they need to run "randomization tests" instead of regressions (I briefly describe randomization tests in the next subsection). For example, the QJE article reads: "[the results] show the clear advantages of randomization inference . . . [it] is superior to all other methods" (p. 585). In this post I show that this conclusion only holds when relying on an unfortunate default setting in STATA. In contrast, when regression results are computed using the default setting in R, the supposed superiority of the randomization test goes away.
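For readers unfamiliar with the term, a randomization test can be sketched in a few lines. The example below is a minimal illustration in Python (not the R code posted in this box, and not the procedure from the QJE paper): it reshuffles the treatment labels many times and asks how often a difference in means at least as large as the observed one arises by chance. The function name `randomization_test`, the number of draws, and the choice of a difference in means as the test statistic are all assumptions made for illustration.

```python
import random
import statistics


def randomization_test(treated, control, n_draws=5000, seed=0):
    """Two-sided randomization p-value for a difference in means (illustrative)."""
    rng = random.Random(seed)
    observed = statistics.fmean(treated) - statistics.fmean(control)
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_draws):
        # Reassign treatment at random, keeping the group sizes fixed.
        rng.shuffle(pooled)
        diff = (statistics.fmean(pooled[:n_treated])
                - statistics.fmean(pooled[n_treated:]))
        # Small tolerance so floating-point ties count as "at least as extreme".
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    # Count the observed assignment itself as one draw, so p is never exactly 0.
    return (extreme + 1) / (n_draws + 1)


if __name__ == "__main__":
    treated = [5.1, 4.9, 5.3, 5.0, 5.2, 5.4, 4.8, 5.1]
    control = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 1.1]
    print(randomization_test(treated, control))
```

The p-value is the share of reshuffled assignments that produce a difference at least as extreme as the one observed, which is why no distributional assumption about the errors is needed.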


The authors have written the following message for visitors to this box.
Please note that these messages can be modified or deleted at any point (even after a box is made permanent)

Original analyses
The original paper by Young (2019), "Channeling Fisher", is available from https://doi.org/10.1093/qje/qjy029

MATA code (STATA's programming language for matrix calculations) to reproduce the calculations in the original paper is available from:

Young, Alwyn, 2018, "Replication Data for: 'Channeling Fisher: Randomization Tests and the Statistical Insignificance of Seemingly Significant Experimental Results'", https://doi.org/10.7910/DVN/JX6HCJ, Harvard Dataverse, V1 (https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/JX6HCJ)

The MATA code for the simulations is about 1,500 lines long and has no comments beyond section headers, so I coded the simulations in R from scratch, following only the descriptions in the paper.
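To give a sense of what a false-positive simulation of this kind involves, here is a minimal sketch in Python. It is not the R code in this box and does not reproduce the design of Table III. It simulates experiments with no true treatment effect and counts how often a regression rejects the null under two heteroskedasticity-robust standard errors: HC1 (to my understanding, what STATA's `robust` option computes) and HC3 (to my understanding, the default of `vcovHC` in R's sandwich package). The sample size, the number of treated units, the error variances, and the use of a normal critical value are all assumptions chosen for illustration.

```python
import random
import statistics


def slope_and_robust_se(x, y):
    """OLS slope of y on x (with intercept) plus its HC1 and HC3 standard errors."""
    n = len(x)
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Leverage of each observation in a simple regression with an intercept.
    leverage = [1.0 / n + d * d / sxx for d in dx]
    hc0 = sum(d * d * e * e for d, e in zip(dx, resid)) / sxx ** 2
    hc1 = hc0 * n / (n - 2)  # degrees-of-freedom correction, k = 2 coefficients
    hc3 = sum(d * d * e * e / (1.0 - h) ** 2
              for d, e, h in zip(dx, resid, leverage)) / sxx ** 2
    return slope, hc1 ** 0.5, hc3 ** 0.5


def false_positive_rates(n=20, n_treated=3, n_sims=2000, seed=1):
    """Share of null simulations rejected at the 5% level (hypothetical design)."""
    rng = random.Random(seed)
    x = [1.0] * n_treated + [0.0] * (n - n_treated)
    critical = statistics.NormalDist().inv_cdf(0.975)  # normal approximation
    rejections = {"HC1": 0, "HC3": 0}
    for _ in range(n_sims):
        # True effect is zero; treated units are noisier (heteroskedasticity).
        y = [rng.gauss(0.0, 3.0 if xi == 1.0 else 1.0) for xi in x]
        slope, se_hc1, se_hc3 = slope_and_robust_se(x, y)
        rejections["HC1"] += abs(slope) > critical * se_hc1
        rejections["HC3"] += abs(slope) > critical * se_hc3
    return {name: count / n_sims for name, count in rejections.items()}


if __name__ == "__main__":
    print(false_positive_rates())
```

With a binary treatment and at least two units per group, the HC3 standard error is never smaller than the HC1 one, so in this sketch HC3 can only reject as often as HC1 or less often. How close either rate comes to the nominal 5% depends on the design assumed here.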

The QJE article also reports reanalyses of 53 published papers, but the data for those papers are neither posted nor linked to from the online materials for the QJE paper. To reproduce those analyses (about the instability of results to the removal of individual observations), you would need to find the datasets for the 53 original papers on your own, figure out the code used in the original studies, and then understand how the code behind the QJE paper relates to that original code.
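The idea of checking how stable a result is to the removal of individual observations can be sketched as follows. This is a minimal Python illustration, not the R script in this box and not the procedure used in the QJE paper. It uses a Welch t-statistic for a difference in means as a stand-in for whatever test statistic an original study reports, and the function names `welch_t` and `leave_one_out_t` are hypothetical.

```python
import statistics


def welch_t(treated, control):
    """Welch t-statistic for the difference in means between two groups."""
    diff = statistics.fmean(treated) - statistics.fmean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return diff / se


def leave_one_out_t(treated, control):
    """t-statistics obtained after dropping each single observation in turn."""
    stats = []
    for i in range(len(treated)):
        stats.append(welch_t(treated[:i] + treated[i + 1:], control))
    for j in range(len(control)):
        stats.append(welch_t(treated, control[:j] + control[j + 1:]))
    return stats


if __name__ == "__main__":
    treated = [1.0, 2.0, 3.0, 12.0]  # made-up data with one large value
    control = [0.0, 1.0, 2.0, 3.0]
    print("full sample:", welch_t(treated, control))
    print("range after dropping one:",
          min(leave_one_out_t(treated, control)),
          max(leave_one_out_t(treated, control)))
```

A wide range across the leave-one-out statistics indicates that a single observation is driving the result.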


This version: October 06, 2021