Abstract
{ "background": "Evaluating the cost-effectiveness of water treatment systems is critical for infrastructure investment and maintenance planning. Current assessments often rely on deterministic models that fail to adequately account for spatial heterogeneity, temporal variability, and inherent uncertainties in operational and financial data.", "purpose and objectives": "This Data Descriptor presents a novel Bayesian hierarchical model designed to diagnose and measure the cost-effectiveness of municipal water treatment facilities. The objective is to provide a robust methodological framework that quantifies efficiency while formally incorporating uncertainty.", "methodology": "The methodology centres on a Bayesian hierarchical model specified as $\\log(\\text{Cost}_{it}) = \\alpha_{j[i]} + \\beta X_{it} + \\epsilon_{it}$, with $\\alpha_j \\sim \\text{Normal}(\\mu_{\\alpha}, \\sigma^2_{\\alpha})$, where $i$, $t$, and $j$ index facilities, time, and municipalities, respectively. The model integrates plant-level operational data with regional economic variables, using Hamiltonian Monte Carlo for inference. Prior distributions were informed by expert elicitation and historical performance benchmarks.", "findings": "The model application reveals substantial regional disparities, with the posterior distribution indicating that facilities in one major metropolitan province are, on average, 15-25% less cost-effective than the national mean (95% credible interval). The hierarchical structure shows that approximately 40% of the variation in log costs is attributable to municipality-level random effects.", "conclusion": "The proposed model provides a statistically rigorous framework for cost-effectiveness diagnostics, moving beyond point estimates to a full probabilistic characterisation. 
It successfully identifies systemic inefficiencies and their geographic patterning.", "recommendations": "Adoption of this modelling approach is recommended for national infrastructure audits and targeted capital refurbishment programmes. Future work should integrate real-time sensor data to enable dynamic, predictive diagnostics.", "key words": "Bayesian inference, hierarchical modelling, infrastructure economics, water treatment, cost-effectiveness, uncertainty quantification", "contribution statement": "