Replication Data for: A natural language measure of ideology in the Brazilian Senate
Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)
Field | Value | |
Title |
Replication Data for: A natural language measure of ideology in the Brazilian Senate
|
|
Identifier |
https://doi.org/10.7910/DVN/JSZI4V
|
|
Creator |
Felipe Carneiro
Bernardo Mueller
Daniel O. Cajueiro |
|
Publisher |
Harvard Dataverse
|
|
Description |
We first obtained the documents from Dados Abertos using data scraping. We then built our data set by legislature, so that each legislature has its own corpus. Each legislature has a specific set of politicians, each of whom has one or more speeches. We treat each speech as a single document; that is, we do not concatenate all speeches by a given politician into one document. At the end of the pre-processing phase we therefore have five sets of speeches (i.e., five corpora), each comprising a specific set of politicians with one or more speeches. Another important step is labelling the documents. As described in The Model (main text), we use the ideological estimates of the main Brazilian parties by Power and Zucco Jr. (2009), who apply a multidimensional scaling technique to survey data and roll-call vote data. Their classification assigns each party to an ideological position: left, center, or right. Given a politician’s party, we can therefore assign each of their speeches to the corresponding ideological class. |
|
Subject |
Social Sciences
|
|
Contributor |
Ciência Política, Revista Brasileira de
|
|
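The corpus construction and labelling steps in the Description can be sketched roughly as follows. This is a minimal illustration, not the authors' replication code: the record fields (`legislature`, `senator`, `party`, `text`), the function `build_corpora`, and the party names in `PARTY_CLASS` are all hypothetical placeholders, and the lookup does not reproduce the Power and Zucco Jr. (2009) classification. Skipping speeches from parties without a known position is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical party-to-class lookup with placeholder party names.
# It stands in for, and does not reproduce, the published classification.
PARTY_CLASS = {"PARTY_A": "left", "PARTY_B": "center", "PARTY_C": "right"}


def build_corpora(speeches, party_class=PARTY_CLASS):
    """Group speeches into one corpus per legislature and label each one.

    Every speech stays a separate document; speeches by the same
    politician are never concatenated.
    """
    corpora = defaultdict(list)
    for speech in speeches:
        label = party_class.get(speech["party"])
        if label is None:
            # Assumption: speeches from unclassified parties are dropped.
            continue
        corpora[speech["legislature"]].append(
            {"senator": speech["senator"], "text": speech["text"], "label": label}
        )
    return dict(corpora)


# Toy input records (invented for illustration only).
speeches = [
    {"legislature": 1, "senator": "S1", "party": "PARTY_A", "text": "first speech"},
    {"legislature": 1, "senator": "S1", "party": "PARTY_A", "text": "second speech"},
    {"legislature": 2, "senator": "S2", "party": "PARTY_C", "text": "third speech"},
    {"legislature": 2, "senator": "S3", "party": "PARTY_X", "text": "unlabelled"},
]

corpora = build_corpora(speeches)
for legislature, documents in sorted(corpora.items()):
    print(legislature, [(d["senator"], d["label"]) for d in documents])
```

Keeping one entry per speech, rather than one per politician, mirrors the description's choice to treat each speech as its own document, so a politician with several speeches contributes several labelled documents to the same corpus.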