Record Details

A Tutorial on the Pretrain-Finetune Paradigm for Natural Language Processing

Harvard Dataverse

 
Title: A Tutorial on the Pretrain-Finetune Paradigm for Natural Language Processing
 
Identifier: https://doi.org/10.7910/DVN/QTT84C
 
Creator: Wang, Yu
 
Publisher: Harvard Dataverse
 
Description: The pretrain-finetune paradigm represents a transformative approach in natural language processing (NLP). Its distinguishing feature is the use of large pretrained language models, which can be finetuned effectively even with limited training data. This efficiency is especially beneficial for research in the social sciences, where the number of annotated samples is often quite limited. Our tutorial offers a comprehensive introduction to the pretrain-finetune paradigm. We first delve into the fundamental concepts of pretraining and finetuning, followed by practical exercises using real-world applications. We showcase the paradigm's application to diverse tasks such as binary classification, multi-class classification, and regression. Emphasizing its efficacy and user-friendliness, the tutorial aims to encourage broader adoption of this paradigm. To this end, we have provided open access to all our code and datasets. The tutorial is particularly valuable for quantitative researchers in psychology, offering them an insightful guide to this innovative approach.
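
For readers skimming this record, a minimal sketch of the paradigm the tutorial teaches may be helpful. The example below finetunes a pretrained encoder for binary classification with the Hugging Face transformers library; the model name (bert-base-uncased), the toy texts and labels, and the hyperparameters are illustrative assumptions, not the tutorial's released code (see the Identifier above for the actual materials).

    # Illustrative sketch of the pretrain-finetune paradigm: adapt a
    # pretrained encoder to binary classification on a tiny labeled set.
    # Model name, data, and hyperparameters are assumptions, not the
    # tutorial's released code.
    import torch
    from torch.utils.data import Dataset
    from transformers import (AutoTokenizer,
                              AutoModelForSequenceClassification,
                              Trainer, TrainingArguments)

    texts = ["great product", "terrible service"]  # toy annotated samples
    labels = [1, 0]

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    # Pretrained weights are reused; only the small classification head
    # on top is initialized from scratch.
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased", num_labels=2)

    class ToyDataset(Dataset):
        def __init__(self, texts, labels):
            self.enc = tokenizer(texts, truncation=True, padding=True,
                                 return_tensors="pt")
            self.labels = torch.tensor(labels)
        def __len__(self):
            return len(self.labels)
        def __getitem__(self, i):
            item = {k: v[i] for k, v in self.enc.items()}
            item["labels"] = self.labels[i]
            return item

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="finetune_out",
                               num_train_epochs=1,
                               per_device_train_batch_size=2),
        train_dataset=ToyDataset(texts, labels),
    )
    trainer.train()  # finetunes all weights on the small labeled set

The same scaffold covers the other tasks the abstract mentions: multi-class classification only changes num_labels, and setting num_labels=1 switches the model to a regression (mean-squared-error) objective.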
 
Subject: Computer and Information Science; Social Sciences
 
Date: 2024-03-02
 
Contributor: Wang, Yu