Replication Data for: Improving the Selection of News Reports for Event Coding Using Ensemble Classification
Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)
Field | Value
Title | Replication Data for: Improving the Selection of News Reports for Event Coding Using Ensemble Classification
Identifier | https://doi.org/10.7910/DVN/2XSVWA
Creator | Croicu, Mihai; Weidmann, Nils B.
Publisher | Harvard Dataverse
Description | We introduce an automatic classification system to eliminate irrelevant source material for the coding of political event data from global news-wires. Our pipeline relies on a high-performance supervised heterogeneous ensemble classifier working on extremely unbalanced training classes. The output is then supplied to human coders for further information extraction, creating a semi-automatic pipeline. The package includes the software required to train and test the classifier, as well as documentation on how to use it.
Subject | Computer and Information Science; Social Sciences; protest data; text classification; machine learning
Contributor | Croicu, Mihai