Record Details

Replication Data for: Disaggregating Repression: Identifying Physical Integrity Rights Allegations in Human Rights Reports

Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)

 
Field Value
 
Title Replication Data for: Disaggregating Repression: Identifying Physical Integrity Rights Allegations in Human Rights Reports
 
Identifier https://doi.org/10.7910/DVN/GB77Q2
 
Creator Rebecca Cordell
K. Chad Clay
Christopher J. Fariss
Reed M. Wood
Thorin M. Wright
 
Publisher Harvard Dataverse
 
Description Most cross-national human rights datasets rely on human coding to produce yearly, country-level indicators of state human rights practices. Hand-coding the documents that contain the information on which these scores are based is tedious and time-consuming but has been viewed as necessary given the complexity and detail of the information contained in the text. However, advances in automated text analysis have the potential to streamline this process without sacrificing accuracy. In this research note, we take the first step in creating this streamlined process by employing a supervised machine learning automated coding method that extracts specific allegations of physical integrity rights violations from the original text of country reports on human rights. This method produces a dataset including 163,512 unique abuse allegations in 196 countries between 1999 and 2016. This dataset and method will assist researchers of physical integrity rights abuse because they will allow them to produce allegation-level human rights measures that have not previously existed, and provide a jumping-off point for future projects aimed at using supervised machine learning to create global human rights metrics.
 
Subject Social Sciences
human rights
 
Contributor Fariss, Christopher