Evaluating and Benchmarking Time Series Foundation Models for Public Healthcare Demand Forecasting

Forecasting
Time Series Foundation Models
Public Health
Supply Chains
Probabilistic Forecasting
Model Evaluation
International Symposium on Forecasting 2026
Author

Harsha Halgamuwe Hewage

Published

June 30, 2026

Context

Public health supply chains must forecast demand across many related but individually sparse, intermittent, and weakly structured time series. Time series foundation models appear promising because they may transfer patterns learned from large collections of data, reducing the need for long local histories and repeated model development.

This talk examines whether that promise translates into better forecast quality than established statistical baselines. Using monthly family planning demand data from Côte d’Ivoire, Lao PDR, and Pakistan, we compare point and probabilistic performance across one-, three-, and six-month forecast horizons.

The evaluation moves beyond aggregate accuracy to consider calibration, runtime, implementation robustness, maintainability, and debuggability. The results show that a small group of foundation models achieves strong forecast rankings, but performance is uneven, aggregate loss can conceal miscalibration, and the largest accuracy gains may require substantial computational resources.
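To illustrate why an aggregate loss and calibration are worth reporting separately, the sketch below computes an average quantile (pinball) loss alongside the empirical coverage of a prediction interval. It is a minimal illustration only: the talk does not state which metrics or code were used, and the function names, the 80% interval, and all numbers are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Average quantile (pinball) loss for one quantile level q in (0, 1)."""
    total = 0.0
    for y, f in zip(y_true, y_pred):
        diff = y - f
        # Under-forecasts are weighted by q, over-forecasts by (1 - q).
        total += q * diff if diff >= 0 else (q - 1) * diff
    return total / len(y_true)


def empirical_coverage(y_true, lower, upper):
    """Share of observations that fall inside [lower, upper]."""
    hits = sum(1 for y, lo, hi in zip(y_true, lower, upper) if lo <= y <= hi)
    return hits / len(y_true)


if __name__ == "__main__":
    # Hypothetical monthly demand and a hypothetical nominal 80% interval
    # (10th and 90th percentile forecasts). Not data from the study.
    demand = [0, 12, 0, 7, 30, 0, 5, 9]
    q10 = [0, 2, 0, 1, 3, 0, 1, 2]
    q90 = [8, 15, 8, 10, 18, 8, 9, 12]

    loss = (pinball_loss(demand, q10, 0.1) + pinball_loss(demand, q90, 0.9)) / 2
    coverage = empirical_coverage(demand, q10, q90)

    print(f"mean pinball loss: {loss:.3f}")
    print(f"empirical coverage of nominal 80% interval: {coverage:.2f}")
```

Comparing the empirical coverage with the nominal level, per model and per horizon, is one simple way to surface miscalibration that a single averaged loss would not reveal.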

The operational question is therefore not simply whether to use a foundation model, but which model provides enough forecast quality to justify its complexity under real public health system constraints.