Two years ago, our research on recovery rate prediction demonstrated that integrating machine learning techniques (such as neural networks and ensemble methods) with rule-based models such as Cubist significantly outperformed traditional regression models (Gavriilidis and Heppe 2023). Moreover, while servicer-provided recovery expectations (Business Plans, BP) proved less reliable as stand-alone predictors, their inclusion as model features substantially enhanced predictive performance. These findings highlighted the benefits of advanced modelling techniques, the value of combining industry expertise with data-driven approaches, and the importance of routinely updating existing BPs.
Building on this foundation, our latest research—conducted in collaboration with Job Reijns, a master’s student at Erasmus University Rotterdam—expands our dataset, refines our methodologies, and introduces a new suite of models, including Random Forests and XGBoost. By incorporating engineered features, we observe improved predictive accuracy in short-term forecasts, although long-term forecasts remain more challenging. These results not only confirm our previous performance gains with machine learning models but also illuminate the trade-offs between enhanced accuracy, increased model complexity, and reduced interpretability.
The improved short-term predictive accuracy offers direct business benefits by enabling financial institutions to:
- Enhance loan pricing accuracy and reduce the risk of misvaluation,
- Optimise debt collection strategies through more precise recovery forecasts, and
- Strengthen investor confidence in NPL securitisations with transparent, data-driven insights.
Job Reijns – Forecasting Recovery Rates of Non-Performing Loans – Master Thesis Quantitative Finance, Erasmus University Rotterdam





