Record Details

IMF Raw Text Files

Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)

 
 
Field Value
 
Title IMF Raw Text Files
 
Identifier https://doi.org/10.7910/DVN/CN0PR9
 
Creator Collodel, Umberto
Betín, Manuel
 
Publisher Harvard Dataverse
 
Description Description: Raw text database of roughly 23,000 documents (country reports and program-related documents) covering the entire IMF membership over the period 1950-2019. Text was extracted using Google Cloud Vision (see the paper for details).

Structure: Each .RDS file contains a named list of the individual documents for one country (identified by its ISO3 code), stored as character strings. String numbering corresponds to page numbering.

We also provide an S3 method (print.corpusTM) to avoid overcrowding the console; run the script before loading the RDS files. Instead of printing the whole country corpus, it displays the number of documents in the corpus and the names of the first and last documents.
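
Illustration: the R sketch below is not the authors' script. It builds a small made-up corpus in the layout described above and defines a stand-in print method for the class corpusTM, to show how the pieces might fit together. The document names, the page texts, and the reading that each document is a character vector with one element per page are assumptions; the shipped print.corpusTM may format its output differently.

```r
# Hypothetical toy corpus mimicking the described layout: a named list of
# documents, each assumed to be a character vector with one element per page.
corpus <- list(
  doc_a = c("page 1 text", "page 2 text"),
  doc_b = c("page 1 text"),
  doc_c = c("page 1 text", "page 2 text", "page 3 text")
)
class(corpus) <- "corpusTM"  # class name taken from the record description

# Illustrative stand-in for the provided S3 method: print a short summary
# (document count, first and last document names) instead of the full text.
print.corpusTM <- function(x, ...) {
  n <- length(x)
  cat("Corpus with", n, "documents\n")
  if (n > 0) {
    cat("First document:", names(x)[1], "\n")
    cat("Last document: ", names(x)[n], "\n")
  }
  invisible(x)
}

# Round trip through an .RDS file; a temporary file stands in for a real
# country file from the dataset.
tmp <- tempfile(fileext = ".RDS")
saveRDS(corpus, tmp)
loaded <- readRDS(tmp)

print(loaded)                      # compact summary via print.corpusTM
n_docs <- length(loaded)           # number of documents in the corpus
page_2 <- loaded[["doc_c"]][2]     # second page of one document
```

Because the method is dispatched on the class attribute stored in the file, it has to be defined before the object is printed, which matches the advice to run the script before loading the RDS files.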
 
Subject Social Sciences
imf documents, imf archives, text mining
 
Contributor Collodel, Umberto