Vol. 2006 No. 1 (2006)
Natural Language Processing Frontiers in African Indigenous Languages of Uganda
Abstract
Natural Language Processing (NLP) has shown promise in transforming how information is processed and used across languages worldwide. However, its application to African indigenous languages remains underexplored, particularly in Uganda, where multilingualism is prevalent. A comprehensive literature review was conducted to identify existing research on NLP applications in Ugandan indigenous languages, and a series of interviews with language experts and software developers was carried out to gather insights into the current landscape and future prospects. Although significant progress has been made, there is a notable lack of standardised datasets for training models specific to these languages, which limits model performance and generalisation across dialects. The findings suggest that, while challenges remain, concerted efforts to develop robust NLP solutions tailored to Ugandan indigenous languages could lead to substantial advances in multilingual information processing. A collaborative framework between academia and industry stakeholders is recommended to accelerate the creation of reliable datasets, as is dedicated research funding for African languages within NLP. Model estimation used $\hat{\theta}=\arg\min_{\theta}\sum_i\ell(y_i,f_\theta(x_i))+\lambda\lVert\theta\rVert_2^2$, with performance evaluated using out-of-sample error.
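The estimator stated at the end of the abstract is an L2-regularised empirical risk minimiser. The abstract does not specify the loss $\ell$, the model $f_\theta$, or the data, so the sketch below is only an illustration under assumptions: a linear model $f_\theta(x)=x^\top\theta$, squared loss, synthetic data, and a held-out split for the out-of-sample error. The function names (`fit_ridge`, `objective`, `out_of_sample_mse`) and the value of `lam` are hypothetical and not taken from the study.

```python
import numpy as np


def fit_ridge(X, y, lam):
    """Closed-form minimiser of sum_i (y_i - x_i @ theta)^2 + lam * ||theta||_2^2.

    Assumes a linear model and squared loss (an illustrative choice; the
    abstract leaves the loss and model unspecified).
    """
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)


def objective(X, y, theta, lam):
    """Value of the regularised objective at theta."""
    residuals = y - X @ theta
    return float(np.sum(residuals ** 2) + lam * np.sum(theta ** 2))


def out_of_sample_mse(X_test, y_test, theta):
    """Mean squared error on data not used for fitting."""
    residuals = y_test - X_test @ theta
    return float(np.mean(residuals ** 2))


# Synthetic data standing in for real features and labels (assumption).
rng = np.random.default_rng(0)
true_theta = np.array([1.5, -2.0, 0.5])
X = rng.normal(size=(200, 3))
y = X @ true_theta + rng.normal(scale=0.1, size=200)

# Hold out the last 50 rows to estimate out-of-sample error.
X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

lam = 1.0  # hypothetical regularisation strength
theta_hat = fit_ridge(X_train, y_train, lam)
test_error = out_of_sample_mse(X_test, y_test, theta_hat)

print("theta_hat:", theta_hat)
print("out-of-sample MSE:", test_error)
```

In practice $\lambda$ would typically be chosen by comparing out-of-sample error across several candidate values rather than fixed in advance, and a non-linear $f_\theta$ would require an iterative optimiser in place of the closed-form solve.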