Vol. 2012 No. 1 (2012)

Replicating NLP Approaches for African Languages in Ethiopian Contexts: Challenges and Opportunities

Mekuria Negash, Department of Cybersecurity, Africa Centers for Disease Control and Prevention (Africa CDC), Addis Ababa
Sileshi Assefa, Ethiopian Institute of Agricultural Research (EIAR)
Yared Mengesha, Department of Software Engineering, Bahir Dar University
Tsegaye Abebe, Bahir Dar University
DOI: 10.5281/zenodo.18958197
Published: January 22, 2012

Abstract

Natural Language Processing (NLP) has advanced considerably for English and other widely spoken languages, but techniques for African languages remain underdeveloped, particularly for low-resource languages such as Amharic, the official language of Ethiopia. We followed the methodology outlined in the original study, using similar datasets supplemented with an additional dataset of Amharic texts collected from various sources within Ethiopia. The NLP tasks were part-of-speech (POS) tagging and named entity recognition (NER). In our replication, we observed a precision of 85% for POS tagging across all datasets, with slight variations attributable to differences in text complexity and domain specificity. These findings suggest that NLP techniques developed for English can be applied effectively to Amharic without substantial modification. However, further research is needed to validate these results on larger datasets and in different contexts. Future studies should identify and address biases or limitations specific to the Amharic language and the Ethiopian context, and more diverse and representative datasets are needed to improve model generalization. Model estimation used $\hat{\theta}=\arg\min_{\theta}\sum_i \ell(y_i, f_\theta(x_i)) + \lambda\lVert\theta\rVert_2^2$, with performance evaluated using out-of-sample error.
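
The estimation objective above can be illustrated with a minimal sketch. The abstract does not specify the loss $\ell$, the model $f_\theta$, or the optimizer, so the squared loss, the linear model, the gradient-descent settings, and the synthetic data below are all assumptions chosen only to make the formula concrete; none of them comes from the study. The function names (`fit_l2_regularized`, `make_data`) are hypothetical.

```python
import random

def predict(theta, x):
    """Linear model f_theta(x) = theta . x (an assumed model form)."""
    return sum(t * xj for t, xj in zip(theta, x))

def fit_l2_regularized(xs, ys, lam, lr=0.1, steps=2000):
    """Gradient descent on sum_i (f_theta(x_i) - y_i)^2 + lam * ||theta||_2^2.

    Squared loss is an assumption; the abstract leaves the loss unspecified.
    """
    n, d = len(xs), len(xs[0])
    theta = [0.0] * d
    for _ in range(steps):
        # Gradient of the penalty term: 2 * lam * theta
        grad = [2.0 * lam * t for t in theta]
        # Gradient of the squared loss, accumulated over the training set
        for x, y in zip(xs, ys):
            resid = predict(theta, x) - y
            for j in range(d):
                grad[j] += 2.0 * resid * x[j]
        # Dividing the step by n keeps the step size stable as n grows
        theta = [t - (lr / n) * g for t, g in zip(theta, grad)]
    return theta

def mean_squared_error(theta, xs, ys):
    """Average squared error of the fitted model on a dataset."""
    return sum((predict(theta, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(n, true_theta, rng):
    """Synthetic data for illustration only (not the study's data)."""
    xs = [[1.0, rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
    ys = [predict(true_theta, x) + rng.gauss(0, 0.1) for x in xs]
    return xs, ys

rng = random.Random(0)
true_theta = [1.0, 2.0, -3.0]
train_x, train_y = make_data(80, true_theta, rng)   # used to fit theta
test_x, test_y = make_data(40, true_theta, rng)     # held out for evaluation

theta_hat = fit_l2_regularized(train_x, train_y, lam=0.1)
test_mse = mean_squared_error(theta_hat, test_x, test_y)
print("theta_hat:", [round(t, 3) for t in theta_hat])
print("out-of-sample MSE:", round(test_mse, 4))
```

The held-out split is what makes the reported error "out-of-sample": `theta_hat` is fitted on the training pairs only and scored on pairs it never saw. Increasing `lam` shrinks the parameter norm, trading some fit on the training data for stability on new data.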

How to Cite

Mekuria Negash, Sileshi Assefa, Yared Mengesha, Tsegaye Abebe (2012). Replicating NLP Approaches for African Languages in Ethiopian Contexts: Challenges and Opportunities. African GIS in Urban Planning (Technical/Methodology), Vol. 2012 No. 1 (2012). https://doi.org/10.5281/zenodo.18958197

Keywords

African, Ethiopia, Computational Linguistics, Data-Driven, Machine Learning
