Record Details

Replication Data for: Improving the Selection of News Reports for Event Coding Using Ensemble Classification

Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)

Field Value
 
Title Replication Data for: Improving the Selection of News Reports for Event Coding Using Ensemble Classification
 
Identifier https://doi.org/10.7910/DVN/2XSVWA
 
Creator Croicu, Mihai; Weidmann, Nils B.
 
Publisher Harvard Dataverse
 
Description We introduce an automatic classification system to eliminate irrelevant source material for the coding of political event data from global news-wires. Our pipeline relies on a high-performance supervised heterogeneous ensemble classifier working on extremely unbalanced training classes. The output is then supplied to human coders for further information extraction, creating a semi-automatic pipeline.

The package includes the software required to train and test the classifier, as well as documentation on how to use it.
 
Subject Computer and Information Science; Social Sciences; protest data; text classification; machine learning
 
Contributor Croicu, Mihai