Abstract
Public health surveillance systems are critical for disease control, yet their methodological reliability in low-resource settings is often unquantified. In Ethiopia, surveillance systems have been implemented, but their longitudinal performance and inherent measurement error have not been rigorously evaluated. This study evaluated the reliability of the national surveillance systems by estimating the consistency of reported incidence data over time and across regions, with the aim of identifying systemic weaknesses. We conducted a panel-data intervention study, analysing routine surveillance reports from a nationally representative sample of health facilities. Reliability was estimated using a generalisability theory framework within a linear mixed model: $X_{prt} = \mu + \nu_p + \pi_r + \tau_t + (\pi\tau)_{rt} + \epsilon_{prt}$, where $p$, $r$, and $t$ index persons, regions, and time. Robust standard errors were clustered at the facility level. The estimated coefficient of relative reliability for the system was 0.65 (95% CI: 0.58, 0.71), indicating substantial inconsistency. Notably, 34% of the variance in reported data was attributable to facility-level heterogeneity rather than true epidemiological signal. The surveillance systems demonstrated only moderate reliability, and facility-level operational factors were a major source of measurement error, compromising the utility of the data for public health decision-making. We recommend integrating routine reliability audits into surveillance procedures and investing in targeted training and standardisation protocols to reduce facility-level reporting heterogeneity.
Keywords: surveillance evaluation, reliability, generalisability theory, panel data, measurement error, health systems

This study provides a novel application of generalisability theory to health surveillance in a low-resource setting, producing the first quantitative, longitudinal reliability coefficient for the national system and isolating facility-level variance as a critical target for intervention.
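To make the notion of a relative reliability coefficient concrete, the following is a minimal sketch and not the study's estimation procedure. It assumes a much simpler design than the mixed model in the abstract: a balanced table of reporting units crossed with time periods, one observation per cell, with variance components recovered from ANOVA mean squares. The function name `relative_g_coefficient` and the example counts are hypothetical, introduced only for illustration.

```python
from statistics import mean


def relative_g_coefficient(scores, n_occasions=None):
    """Relative G coefficient for a balanced units x occasions table.

    scores: list of rows, one row per unit (assumed here to be a reporting
    unit such as a facility), one column per occasion (time period).
    n_occasions: number of occasions averaged over in the decision study;
    defaults to the number of columns observed.
    """
    n_p = len(scores)
    n_t = len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(n_t)]

    # Sums of squares for the two-way crossed design without replication.
    ss_p = n_t * sum((m - grand) ** 2 for m in row_means)
    ss_t = n_p * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_t

    ms_p = ss_p / (n_p - 1)
    ms_res = ss_res / ((n_p - 1) * (n_t - 1))

    # Expected mean squares: E[MS_p] = var_res + n_t * var_p, E[MS_res] = var_res.
    var_res = ms_res
    var_p = max((ms_p - ms_res) / n_t, 0.0)  # truncate negative estimates at zero

    k = n_occasions or n_t
    denominator = var_p + var_res / k
    if denominator == 0:
        raise ValueError("no variance in the data; coefficient is undefined")
    return var_p / denominator


if __name__ == "__main__":
    # Hypothetical monthly case counts for four units over three months.
    counts = [
        [12, 15, 11],
        [30, 28, 35],
        [8, 14, 9],
        [21, 19, 26],
    ]
    print(round(relative_g_coefficient(counts), 3))
```

In this simplified setting the coefficient is the share of variance due to stable differences between units, relative to that variance plus the residual error averaged over occasions. The model in the abstract additionally separates region, time, and region-by-time effects, so its coefficient would be built from a larger set of variance components.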