Record Details

KenTrans: A Parallel Corpora for Swahili and local Kenyan Languages

Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)

 
 
Field Value
 
Title KenTrans: A Parallel Corpora for Swahili and local Kenyan Languages
 
Identifier https://doi.org/10.7910/DVN/NOAT0W
 
Creator Wanzare, Lilian D.A
Indede, Florence
McOnyango, Owen
Ombui, Edward
Wanjawa, Barack
Muchemi, Lawrence
 
Publisher Harvard Dataverse
 
Description This project produced a parallel corpus between Swahili and two other Kenyan languages: Dholuo and Luhya. The Luhya language has several dialects, of which three were chosen as a start: Lumarachi, Logooli and Lubukusi. A total of 12,400 sentences were translated into Kiswahili from a sample of Dholuo and Luhya texts (1,500 Dholuo-Kiswahili sentence pairs and 10,900 Luhya-Kiswahili sentence pairs). Each document contains sentence pairs: the sentence in the original language starts with the letter “O” followed by a colon (“O:”), while the translated Kiswahili sentence below it starts with the letter “T” followed by a colon (“T:”).
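
The “O:” / “T:” layout described above could be read into sentence pairs along the following lines. This is only a minimal sketch, not the project's own tooling: it assumes each document has been exported to plain text with one sentence per line, and the function name `read_pairs` and the placeholder sample lines are hypothetical.

```python
from typing import Iterable, Iterator, Tuple


def read_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (original, translation) pairs from 'O:' / 'T:' prefixed lines."""
    original = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("O:"):
            # Remember the source-language sentence until its translation arrives.
            original = line[2:].strip()
        elif line.startswith("T:") and original is not None:
            # Pair the Kiswahili translation with the preceding original.
            yield original, line[2:].strip()
            original = None
        # Any other line (blank lines, headings) is skipped.


# Placeholder lines only; these are not sentences from the corpus.
sample = [
    "O: <sentence in Dholuo or Luhya>",
    "T: <Kiswahili translation>",
    "",
    "O: <another source sentence>",
    "T: <another translation>",
]
pairs = list(read_pairs(sample))
```

A “T:” line with no preceding “O:” line is ignored in this sketch; a stricter reader might raise an error instead.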



Acknowledgement of translators:

Luo - Swahili: Mercy Lavinca Oduoll (Coordinator), Bildad Okebe, Immaculate Ochieng, Mary Muma

Luhyia (Logooli) - Swahili: Phillip Lumwamu (Coordinator), Kints Mugoha Musungu, Vivian Alivitsa, Joseph Ambwere, Joyline Ingasiani

Luhyia (Bukusu) - Swahili: Martin Barasa Mulwale (Coordinator), Samwel Ralph Nyongesa, Tobias Shikuku, Phelisters N Simiyu

Luhyia (Marachi) - Swahili: Judith Awinja (Coordinator), Evans Owino, Belinda Oduor, Frankline Mwaro
 
Subject Computer and Information Science
Social Sciences
Translation
Machine Translation
Parallel Corpora
 
Language Swahili
 
Contributor WANZARE, LILIAN D.
LACUNA Fund
Maseno University
 
Type Parallel Corpora