Record Details

Replication Data for: Recalibration of Predicted Probabilities Using the "Logit Shift": Why does it work, and when can it be expected to work well?

Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)


Field	Value

Title	Replication Data for: Recalibration of Predicted Probabilities Using the "Logit Shift": Why does it work, and when can it be expected to work well?

Identifier	https://doi.org/10.7910/DVN/MPCRPK

Creator	Rosenman, Evan; McCartan, Cory; Olivella, Santiago

Publisher	Harvard Dataverse

Description	This deposit contains the simulation scripts that generate the three tables in our manuscript. Each script corresponds to one table (Table 1, Table 2, or Table 3).
The Table 1 script demonstrates how closely the logit shift tracks the exact Poisson-Binomial posterior update. We consider two sample sizes (n = 100 and n = 1000) and six data-generating distributions for a set of initial Democratic support scores. We assume that the true number of Democratic votes is 20% lower than predicted, and compute updated scores via both the logit shift and the exact Poisson-Binomial posterior probabilities. We show that the two sets of updated scores are virtually identical under several metrics.
The Table 2 script demonstrates one drawback of the logit shift: it can skew the scores for minority groups within each precinct. For simplicity we consider two racial groups, in precincts whose proportions of White and Black voters are 70%-30%, 80%-20%, and 90%-10%. We suppose the Democratic support scores are overestimated by 10 percentage points for White voters and underestimated by 10 percentage points for Black voters. We use the same six data-generating distributions for the initial support scores. We report relative shifts in correlation between the true scores and the initial scores vs. between the true scores and the logit-shifted scores. In the simulations the logit shift improves score accuracy in aggregate and for White voters, but it produces worse scores for Black voters because they comprise a minority within the precinct.
The Table 3 script demonstrates another drawback of the logit shift: it can correct only an incorrect mean of the initial score distribution, not an incorrect shape. We consider every possible pairing of our six data-generating distributions. For each pair, we sample 1,000 voters such that the true support probabilities follow the first distribution and the initial support scores follow the second, with each unit holding the same rank in both distributions. We again report relative shifts in correlation between the true scores and the initial scores vs. between the true scores and the logit-shifted scores. If the true support probability distribution and the initial score distribution have the same mean, the correlation shifts are essentially zero; if the means differ, the shifts are much more positive.

Subject	Mathematical Sciences; logit shift; recalibration; elections; Poisson-Binomial distribution

Contributor	Rosenman, Evan
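
The comparison that the Description attributes to the Table 1 script can be illustrated with a minimal Python sketch. This is not the deposited replication code. The function names, the Beta(2, 2) score distribution, the random seed, and the bisection solver are all assumptions chosen for illustration; only the setup (n = 100 voters, an observed Democratic total 20% below the predicted total) follows the Description. The logit shift adds one constant to every score's logit so that the scores sum to the observed total, while the exact update conditions independent Bernoulli votes on that total using the Poisson-Binomial distribution.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit(p):
    return math.log(p / (1.0 - p))


def logit_shift(scores, target_total, tol=1e-12):
    """Add one constant to every logit so the shifted scores sum to target_total."""
    logits = [logit(p) for p in scores]
    lo, hi = -50.0, 50.0
    # The shifted sum is increasing in the shift, so bisection finds the root.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(sigmoid(z + mid) for z in logits) < target_total:
            lo = mid
        else:
            hi = mid
    shift = 0.5 * (lo + hi)
    return [sigmoid(z + shift) for z in logits]


def poisson_binomial_pmf(probs):
    """PMF of a sum of independent Bernoulli(p_i) draws, by dynamic programming."""
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf


def exact_posterior(scores, observed_total):
    """P(voter i votes Democratic | exactly observed_total Democratic votes)."""
    if observed_total == 0:
        return [0.0] * len(scores)
    denom = poisson_binomial_pmf(scores)[observed_total]
    posterior = []
    for i, p in enumerate(scores):
        others = scores[:i] + scores[i + 1:]
        # Voter i votes Democratic and the others supply the remaining votes.
        numer = p * poisson_binomial_pmf(others)[observed_total - 1]
        posterior.append(numer / denom)
    return posterior


# Illustrative setup (assumed): 100 voters with Beta(2, 2) initial scores.
rng = random.Random(0)
scores = [rng.betavariate(2, 2) for _ in range(100)]
observed = round(0.8 * sum(scores))  # true total 20% below the predicted total

shifted = logit_shift(scores, observed)
exact = exact_posterior(scores, observed)
max_gap = max(abs(a - b) for a, b in zip(shifted, exact))
print(f"largest per-voter gap between the two updates: {max_gap:.5f}")
```

Both updated score vectors sum to the observed total, so any difference between them lies in how the adjustment is spread across voters. The leave-one-out recomputation in `exact_posterior` is cubic in the number of voters, which is acceptable at this size but would need a faster method for large precincts.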