KenTrans: A Parallel Corpora for Swahili and local Kenyan Languages
Harvard Dataverse (Africa Rice Center, Bioversity International, CCAFS, CIAT, IFPRI, IRRI and WorldFish)
Field | Value |
Title |
KenTrans: A Parallel Corpora for Swahili and local Kenyan Languages
|
|
Identifier |
https://doi.org/10.7910/DVN/NOAT0W
|
|
Creator |
Wanzare, Lilian D.A
Indede, Florence
McOnyango, Owen
Ombui, Edward
Wanjawa, Barack
Muchemi, Lawrence |
|
Publisher |
Harvard Dataverse
|
|
Description |
This project produced a parallel corpus between Swahili and two other Kenyan languages: Dholuo and Luhya. The Luhya language has several dialects, and three were chosen as a start: Lumarachi, Logooli and Lubukusi. A total of 12,400 sentences were translated into Kiswahili from a sample of Dholuo and Luhya texts (1,500 Dholuo-Kiswahili sentence pairs and 10,900 Luhya-Kiswahili sentence pairs). Each document contains sentence pairs: the sentence in the original language starts with the letter “O” followed by a colon (“O:”), while the translated Kiswahili sentence below it starts with the letter “T” followed by a colon (“T:”). Acknowledgement of translators: Luo - Swahili: Mercy Lavinca Oduoll (Coordinator), Bildad Okebe, Immaculate Ochieng, Mary Muma; Luhyia (Logooli) - Swahili: Phillip Lumwamu (Coordinator), Kints Mugoha Musungu, Vivian Alivitsa, Joseph Ambwere, Joyline Ingasiani; Luhyia (Bukusu) - Swahili: Martin Barasa Mulwale (Coordinator), Samwel Ralph Nyongesa, Tobias Shikuku, Phelisters N Simiyu; Luhyia (Marachi) - Swahili: Judith Awinja (Coordinator), Evans Owino, Belinda Oduor, Frankline Mwaro |
|
Subject |
Computer and Information Science
Social Sciences
Translation
Machine Translation
Parallel Corpora |
|
Language |
Swahili
|
|
Contributor |
WANZARE, LILIAN D.
LACUNA Fund
Maseno University |
|
Type |
Parallel Corpora
|
|
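The Description field states that each document holds sentence pairs, with the original-language sentence prefixed by "O:" and its Kiswahili translation on the line below prefixed by "T:". A minimal sketch of reading such a document into pairs follows. It assumes plain-text input with one sentence per line and straight ASCII colons; the function name `parse_pairs` and the placeholder sample text are hypothetical and not part of the dataset.

```python
from typing import Iterable, List, Tuple


def parse_pairs(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Collect (original, translation) pairs from O:/T: prefixed lines."""
    pairs: List[Tuple[str, str]] = []
    pending = None  # most recent "O:" sentence still waiting for its "T:" line
    for raw in lines:
        line = raw.strip()
        if line.startswith("O:"):
            pending = line[2:].strip()
        elif line.startswith("T:") and pending is not None:
            pairs.append((pending, line[2:].strip()))
            pending = None
        # Any other line (blank lines, headers, unpaired "T:") is skipped.
    return pairs


# Placeholder text standing in for real corpus content (assumption).
sample = """O: source sentence one
T: target sentence one

O: source sentence two
T: target sentence two
"""

pairs = parse_pairs(sample.splitlines())
for original, translation in pairs:
    print(original, "=>", translation)
```

Lines that carry neither prefix are ignored here, and a "T:" line with no preceding "O:" line is dropped; a stricter reader might raise an error instead, depending on how clean the actual files are.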