Record Details

E-learning Recommender System Dataset

Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)

 
 
Field Value
 
Title E-learning Recommender System Dataset
 
Identifier https://doi.org/10.7910/DVN/BMY3UD
 
Creator Hafsa, Mounir
 
Publisher Harvard Dataverse
 
Description

The Mandarine Academy Recommender System (MARS) Dataset is captured from a real-world open MOOC (https://mooc.office365-training.com/). The dataset offers both explicit and implicit ratings for the French and English versions of the MOOC. Compared with classical recommendation datasets such as MovieLens, it is rather small because of the educational nature of the available content. However, it offers insight into real-world ratings and provides a testing ground beyond the commonly used datasets.


All items are available online for viewing in both the French and English versions. All selected users have rated at least one item. No demographic information is included. Each user is represented by an id and a job (if available).



For both French and English, the same set of files is available in .csv format. We provide the following files:



  • Users: contains information about user ids and their jobs.

  • Items: contains information about items (resources) in the selected language. The file contains a mix of feature types.

  • Ratings: both explicit (watch time) and implicit (page views of items) ratings.


Formatting and Encoding

The dataset files are written as comma-separated values files with a single header row. Columns that contain commas (,) are escaped using double quotes ("). These files are encoded as UTF-8.
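
The formatting rules above map directly onto Python's standard csv module. The following is a minimal sketch, not an official loader: the sample row is made up to illustrate a quoted comma, and the helper names read_rows and load_csv are hypothetical.

```python
import csv
import io

# Made-up sample in the documented layout; placeholder values, not real dataset rows.
SAMPLE = (
    "item_id,language,name\n"
    '1,fr,"Créer, partager et modifier"\n'
)


def read_rows(stream):
    """Parse a header-first CSV text stream into a list of dicts keyed by column name."""
    return list(csv.DictReader(stream))


def load_csv(path):
    """Open a dataset file as UTF-8 and parse it; newline="" lets csv handle line endings."""
    with open(path, encoding="utf-8", newline="") as handle:
        return read_rows(handle)


rows = read_rows(io.StringIO(SAMPLE))
print(rows[0]["name"])  # the quoted comma stays inside the single "name" field
```

With the real files, load_csv("items.csv") would be the entry point, assuming the file sits in the working directory.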


User Ids

User ids are consistent between explicit_ratings.csv, implicit_ratings.csv, and users.csv (i.e., the same id refers to the same user across the dataset).


Item Ids

Item ids are consistent between explicit_ratings.csv, implicit_ratings.csv, and items.csv (i.e., the same id refers to the same item across the dataset).
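
Because ids are shared across files, one quick sanity check is to confirm that every id used in a ratings file also appears in the matching reference file. The sketch below assumes nothing beyond the documented column names; the sample rows (including the date and rating values) are invented placeholders, and column_values and dangling_ids are hypothetical helper names. The same check can be run for users once the id column name in users.csv is known; that name is not documented here.

```python
import csv
import io


def column_values(stream, column):
    """Collect the distinct values of one column from a header-first CSV stream."""
    return {row[column] for row in csv.DictReader(stream)}


def dangling_ids(ratings_stream, reference_stream, column):
    """Return ids that occur in the ratings stream but not in the reference stream."""
    return column_values(ratings_stream, column) - column_values(reference_stream, column)


# Invented placeholder rows; item 12 is deliberately absent from ITEMS.
ITEMS = "item_id,language\n10,en\n11,en\n"
EXPLICIT = (
    "user_id,item_id,watch_percentage,created_at,rating\n"
    "1,10,80,2021-01-01,4\n"
    "2,12,20,2021-01-02,1\n"
)

missing = dangling_ids(io.StringIO(EXPLICIT), io.StringIO(ITEMS), "item_id")
print(sorted(missing))
```

On the real dataset the result should be an empty set if the consistency described above holds.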


Ratings Data File Structure

All ratings are contained in the files explicit_ratings.csv and implicit_ratings.csv. Each line of these files after the header row represents one rating of one item by one user, in one of the following formats:



  • item_id,user_id,created_at (implicit_ratings.csv)

  • user_id,item_id,watch_percentage,created_at,rating (explicit_ratings.csv)
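
Note that the two files list item_id and user_id in a different order, so reading columns by name is safer than reading them by position. As a rough illustration, the sketch below counts implicit interactions per (user, item) pair. The sample rows and timestamp values are made up, and view_counts is a hypothetical helper; whether repeated rows should be summed this way is an assumption, not something the dataset description states.

```python
import csv
import io
from collections import Counter

# Invented placeholder rows in the documented implicit_ratings.csv layout.
IMPLICIT = (
    "item_id,user_id,created_at\n"
    "10,1,2021-01-01\n"
    "10,1,2021-01-03\n"
    "11,2,2021-01-02\n"
)


def view_counts(stream):
    """Count rows per (user_id, item_id) pair, looking columns up by header name."""
    counts = Counter()
    for row in csv.DictReader(stream):
        counts[(row["user_id"], row["item_id"])] += 1
    return counts


counts = view_counts(io.StringIO(IMPLICIT))
for (user_id, item_id), n in sorted(counts.items()):
    print(user_id, item_id, n)
```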


Item Data File Structure

Item information is contained in the file items.csv. Each line of this file after the header row represents one item, and has the following format:



  • item_id,language,name,nb_views,description,created_at,Difficulty,Job,Software,Theme,duration,type
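
The item column names mix lowercase and capitalized forms (Difficulty, Job, Software, Theme), and name lookups are case-sensitive. A small header check, sketched below, can catch mismatches before any further processing. The helper name header_diff is hypothetical, and the two in-memory headers are constructed only for illustration.

```python
import csv
import io

# Column names exactly as documented for items.csv.
EXPECTED_COLUMNS = [
    "item_id", "language", "name", "nb_views", "description", "created_at",
    "Difficulty", "Job", "Software", "Theme", "duration", "type",
]


def header_diff(stream):
    """Return (missing, unexpected) column names relative to the documented header."""
    header = next(csv.reader(stream))
    missing = [c for c in EXPECTED_COLUMNS if c not in header]
    unexpected = [c for c in header if c not in EXPECTED_COLUMNS]
    return missing, unexpected


# Illustration only: one header as documented, one with every name lowercased.
documented = io.StringIO(",".join(EXPECTED_COLUMNS) + "\n")
lowercased = io.StringIO(",".join(c.lower() for c in EXPECTED_COLUMNS) + "\n")

ok = header_diff(documented)
bad = header_diff(lowercased)
print(ok)
print(bad)
```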


 
Subject Computer and Information Science
Recommender Systems, E-Learning, MOOC, Implicit, Explicit, Interactions, Ratings
 
Language English
French
 
Date 2021-09-21
 
Contributor Hafsa, Mounir
 
Source Database