Adaptive models: Models that adjust their parameters or structure dynamically over time in response to new data or changing conditions. For example, the variables selected, as well as their relative importance, in the Dynamic Elastic Net model (DynENet) may vary from week to week.
Agent-based modelling (ABM): A simulation approach that models the actions and interactions of individual agents (e.g. individuals or households) to explore how individual decisions shape overall dynamics.
Auto-Regressive Integrated Moving Average (ARIMA): A general class of models for forecasting a time series that can be made “stationary” by differencing, possibly in conjunction with nonlinear transformations such as logging or deflating. A non-seasonal ARIMA model is classified as an “ARIMA(p,d,q)” model, where p is the number of autoregressive terms, d is the number of non-seasonal differences needed for stationarity, and q is the number of lagged forecast errors in the prediction equation.
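To make the differencing step (the “I” in ARIMA) concrete, a minimal pure-Python sketch; the example series is illustrative:

```python
def difference(series, d=1):
    """Apply d rounds of first-order differencing to a list of values."""
    for _ in range(d):
        series = [series[i] - series[i - 1] for i in range(1, len(series))]
    return series

# A series with a linear trend becomes constant after one difference (d=1):
trend = [2, 4, 6, 8, 10]
print(difference(trend, d=1))  # [2, 2, 2, 2]
```

A quadratic trend would need two rounds of differencing (d=2) to become constant.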
Auto-Regressive Integrated Moving Average with eXogenous inputs (ARIMAX): The ARIMAX model extends the ARIMA framework by integrating exogenous variables, i.e. external predictors or factors that can improve forecast accuracy.
Bayesian Network Models: Probabilistic graphical models representing variables and their conditional dependencies via a Directed Acyclic Graph (DAG), useful for modelling complex causal structures.
Bayesian Structural Time Series (BSTS): A Bayesian time series model that decomposes data into trend, seasonal, and regression components. Useful for forecasting and causal inference under uncertainty.
Calibration curves/scores: Tools to assess how well predicted probabilities match observed outcomes. A well-calibrated model outputs probabilities that correspond closely to actual event frequencies.
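A minimal sketch of the idea behind a calibration check: group predictions by their stated probability and compare each group with the observed event frequency. The probabilities and outcomes below are hypothetical:

```python
from collections import defaultdict

def calibration_table(probs, outcomes):
    """Group predictions by their stated probability and return the
    observed frequency of the event in each group."""
    groups = defaultdict(list)
    for p, y in zip(probs, outcomes):
        groups[p].append(y)
    return {p: sum(ys) / len(ys) for p, ys in sorted(groups.items())}

probs    = [0.2, 0.2, 0.8, 0.8, 0.8, 0.8]
outcomes = [0,   0,   1,   1,   1,   0]
print(calibration_table(probs, outcomes))  # {0.2: 0.0, 0.8: 0.75}
```

In a well-calibrated model the observed frequency in each group would be close to the predicted probability (here 0.75 vs 0.8 is reasonably close; in practice predictions are binned into probability ranges rather than grouped by exact value).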
Causal forecasting models: Models that incorporate the mechanisms and motivations behind migration decisions, often using individual-level data to distinguish between types of migrants (e.g. economic vs. forced).
Classification-based migration forecasting: Forecasting approaches that predict the category (e.g. low/medium/high) of future migration volumes rather than precise numerical counts.
Directed Acyclic Graph (DAG): A finite graph with directed edges and no cycles, often used to represent causal relationships in Bayesian networks or structural models.
Dynamic causal models: Time‑sensitive causal models that account for changes in variable relationships over time, often used to simulate the impact of shocks or policies on migration behaviour.
Exponential smoothing models: Special cases of ARIMA models (simple exponential smoothing is equivalent to ARIMA(0,1,1)) which use a weighted average of recent observations, rather than the most recent observation alone, to filter out the noise and more accurately estimate the local mean. Often used for short-term forecasting of series without strong trends or seasonality.
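Simple exponential smoothing can be sketched in a few lines of pure Python; the smoothing constant alpha and the series are illustrative:

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed values s_t = alpha*x_t + (1-alpha)*s_{t-1};
    the last value serves as the one-step-ahead forecast."""
    smoothed = [series[0]]  # initialise with the first observation
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

obs = [10, 12, 14]
print(simple_exponential_smoothing(obs, alpha=0.5))  # [10, 11.0, 12.5]
```

A larger alpha weights recent observations more heavily; a smaller alpha produces a smoother, slower-reacting estimate of the local mean.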
Importance‑Frequency (IF) space: A two‑dimensional space used to visualise and assess model features based on Importance (how strongly a feature influences the model’s predictions) and Frequency (how often the feature appears or is used across different models, forecasts, or simulations).
Least Absolute Shrinkage and Selection Operator (LASSO): A regression technique that includes a penalty for large coefficients, encouraging simpler, more interpretable models and reducing overfitting.
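The L1 penalty at the heart of LASSO shrinks coefficients via the soft-thresholding operator (as used in coordinate-descent solvers); a minimal sketch with illustrative values:

```python
def soft_threshold(z, lam):
    """Soft-thresholding operator: shrinks z toward zero by lam,
    setting it exactly to zero when |z| <= lam (the feature is dropped)."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

print(soft_threshold(3.0, 1.0))   # 2.0
print(soft_threshold(0.5, 1.0))   # 0.0  (coefficient eliminated)
print(soft_threshold(-3.0, 1.0))  # -2.0
```

Setting small coefficients exactly to zero is what makes LASSO perform variable selection, unlike the ridge penalty, which only shrinks coefficients toward zero.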
Long Short-Term Memory (LSTM): A type of Recurrent Neural Network (RNN) that captures long-term dependencies in sequential data, commonly used for time series forecasting.
Machine Learning: A class of algorithms that identify patterns in data and improve their performance with experience, widely used in predictive migration models.
Macro-Simulation: A modelling technique that uses aggregate data and equations to forecast large‑scale migration flows.
Mean Absolute Error (MAE): A metric that measures the average magnitude of forecast errors, regardless of direction.
Mean Absolute Percentage Error (MAPE): A metric that expresses forecast errors as a percentage of the actual values, useful for comparing errors across datasets.
Mean Absolute Scaled Error (MASE): A scale‑independent metric for forecast accuracy, useful for comparing models across different scales and time series.
Mean Percentage Error (MPE): A metric that captures the average percentage bias of forecasts; it indicates whether a model tends to over- or under-predict.
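The traditional error metrics above (MAE, MAPE, MPE), together with MASE and RMSE, can be sketched directly in pure Python; the actual and forecast series below are illustrative:

```python
import math

def mae(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    return 100 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)

def mpe(actual, forecast):
    # Sign convention (actual - forecast): negative means over-prediction.
    return 100 * sum((a - f) / a for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mase(actual, forecast, training):
    # Scale by the in-sample MAE of the naive (random-walk) forecast.
    naive_mae = sum(abs(training[t] - training[t - 1])
                    for t in range(1, len(training))) / (len(training) - 1)
    return mae(actual, forecast) / naive_mae

actual   = [100, 120, 140, 160]
forecast = [110, 115, 150, 150]
print(mae(actual, forecast))             # 8.75
print(round(mape(actual, forecast), 2))  # 6.89
print(mase(actual, forecast, actual))    # 0.4375
```

A MASE below 1 indicates the forecast outperforms the naive benchmark on the training scale; here the training series is reused for scaling purely for illustration.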
Micro-simulation: A modelling approach simulating individual-level behaviours based on detailed characteristics and decision-making rules.
Modern deep-learning techniques: Advanced neural network models (e.g. LSTM, RNN, transformers) capable of learning complex, non-linear relationships in high-dimensional data.
Overfitting: When a model fits training data too closely, capturing noise rather than generalizable patterns, resulting in poor performance on new data.
Partial Dependence Plots (PDPs): Visualization tools showing the marginal effect of one or two input features on the model’s predictions.
Probability Density Function (PDF): A function describing the likelihood of a continuous random variable taking on certain values. The total area under the curve of a PDF is equal to one, and the probability that the variable falls within a specific interval is given by the integral of the PDF over that interval.
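To make the integral property concrete, a small sketch that numerically integrates the exponential PDF f(x) = λe^(−λx) with λ = 1; the closed-form answer for P(0 ≤ X ≤ 1) is 1 − e^(−1) ≈ 0.632:

```python
import math

def exp_pdf(x, lam=1.0):
    """Exponential probability density function."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def integrate(f, a, b, n=10_000):
    """Trapezoidal approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    total += sum(f(a + i * h) for i in range(1, n))
    return total * h

p = integrate(exp_pdf, 0.0, 1.0)
print(round(p, 4))  # 0.6321, matching 1 - exp(-1)
```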
Random Walk model: A time series model in which the current value is equal to the previous value plus a random error term. Random walks are non-stationary processes, and the model often serves as a naïve benchmark in forecasting, representing the idea that the best prediction for tomorrow is today’s value.
Recursive Neural Networks: Neural networks that apply the same weights recursively across structured data (e.g. syntactic trees), not to be confused with Recurrent Neural Networks (RNNs).
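The random-walk (naive) benchmark is trivial to implement, which is exactly why it is a useful baseline; a one-line sketch with a hypothetical series:

```python
def naive_forecast(series, horizon):
    """Random-walk (naive) benchmark: every future value is the last observation."""
    return [series[-1]] * horizon

history = [95, 102, 99, 104]
print(naive_forecast(history, horizon=3))  # [104, 104, 104]
```

A candidate model that cannot beat this benchmark on held-out data adds little forecasting value.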
Recurrent Neural Network (RNN): A type of neural network particularly well-suited for sequential data, such as time series, due to its memory of previous inputs.
Ridge penalty: A regularisation technique that adds a penalty term to regression coefficients to reduce overfitting and improve model generalisation.
Root Mean Square Error (RMSE): A commonly used forecast error metric that penalises large errors more heavily due to squaring the error terms.
Saliency Maps: Tools used in neural networks to visualise which input features most influence the model’s predictions.
SARIMA (Seasonal ARIMA): An extension of ARIMA that incorporates seasonal components, useful for forecasting time series with repeating seasonal patterns.
SHAP (SHapley Additive exPlanations): A model-agnostic interpretability method that assigns each input feature an importance value for individual predictions based on co‑operative game theory.
Stationary time series: A time series whose statistical properties, such as mean, variance, and autocorrelation, are constant over time; it does not exhibit trends or seasonal effects, and its properties do not change when the series is shifted in time. Stationarity is an important assumption for many time series models.
Time series-specific metrics: Metrics designed for evaluating time series forecasts, including MASE, Theil’s U-statistic, and others.
Traditional metrics: Commonly used forecast error metrics such as MAE, MAPE, RMSE, and MPE.
Vector Auto-Regressive (VAR) model: A multivariate time series model where each variable is modelled as a function of past values of itself and other variables in the system.
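A one-step VAR(1) forecast is just a matrix-vector product plus an intercept, x_{t+1} = c + A·x_t; a minimal pure-Python sketch with hypothetical coefficients for a two-variable system:

```python
def var1_forecast(A, c, x):
    """One-step VAR(1) forecast: x_{t+1} = c + A @ x_t, in pure Python."""
    return [c[i] + sum(A[i][j] * x[j] for j in range(len(x)))
            for i in range(len(A))]

# Hypothetical 2-variable system, e.g. migration flow and an economic indicator.
A = [[0.5, 0.1],
     [0.0, 0.9]]
c = [1.0, 0.0]
x_t = [10.0, 20.0]
print(var1_forecast(A, c, x_t))  # [8.0, 18.0]
```

Each row of A captures how one variable responds to the past values of all variables in the system; in practice the coefficients are estimated from data, e.g. by least squares per equation.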