JoCo2024
General Insurance

Zero-Inflated Tweedie Boosted Trees with CatBoost for Insurance Analytics

Speakers: Emiliano Valdez
In this paper, we explore advanced modifications to the Tweedie regression model, addressing its limitations in modeling aggregate claims for various types of insurance such as automobile, health, and liability. Traditional Tweedie models, while effective in capturing the probability and magnitude of claims, usually fall short in accurately representing the large incidence of zero claims. Our approach involves a refined modeling of the zero-claim process, coupled with the integration of boosting methods. These methods iteratively improve predictive accuracy by focusing on the challenging cases identified by previous models, without causing significant overfitting. Although this iteration inherently slows down the learning algorithm, several efficient implementations that also help with precise parameter tuning have emerged, such as XGBoost, LightGBM, and CatBoost. We chose CatBoost, a boosting approach that effectively handles categorical and other special types of data. The core contribution of our paper is the combination of a separate model for zero claims with tree-based boosting ensemble methods within a CatBoost framework, under the assumption that the inflated probability of zero is a function of the mean parameter. The efficacy of our enhanced Tweedie model is demonstrated through its application to an insurance telematics dataset, which presents the additional complexity of compositional feature variables. Our modeling results reveal a marked improvement in model performance, showcasing the model's potential to deliver more accurate predictions for insurance claim analytics.
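To make the zero-inflation idea concrete, the following minimal sketch compares the probability of a zero claim under a plain Tweedie model with that under a zero-inflated mixture. It relies on the standard compound Poisson-gamma result that, for power parameter 1 < p < 2, a Tweedie variable with mean mu and dispersion phi has P(Y = 0) = exp(-mu^(2-p) / (phi (2-p))). The function names, the parameter `gamma`, and in particular the link `q = 1 / (1 + mu**gamma)` are assumptions made for illustration only; the abstract states that the inflated zero probability is a function of the mean but does not give its form.

```python
import math


def tweedie_zero_prob(mu: float, phi: float, p: float) -> float:
    """P(Y = 0) for a compound Poisson-gamma Tweedie with 1 < p < 2."""
    if not 1.0 < p < 2.0:
        raise ValueError("power parameter p must lie strictly between 1 and 2")
    # Rate of the underlying Poisson claim count.
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))
    return math.exp(-lam)


def inflation_prob(mu: float, gamma: float) -> float:
    """Hypothetical link (an assumption, not taken from the paper):
    extra zero mass that shrinks as the mean mu grows."""
    return 1.0 / (1.0 + mu ** gamma)


def zero_inflated_zero_prob(mu: float, phi: float, p: float, gamma: float) -> float:
    """P(Y = 0) under the mixture q * (point mass at 0) + (1 - q) * Tweedie."""
    q = inflation_prob(mu, gamma)
    return q + (1.0 - q) * tweedie_zero_prob(mu, phi, p)


if __name__ == "__main__":
    for mu in (0.5, 1.0, 5.0):
        plain = tweedie_zero_prob(mu, phi=1.0, p=1.5)
        inflated = zero_inflated_zero_prob(mu, phi=1.0, p=1.5, gamma=1.0)
        print(f"mu={mu}: plain={plain:.4f}, zero-inflated={inflated:.4f}")
```

Because the mixture adds a point mass at zero on top of the Tweedie's own zero probability, the zero-inflated value is never smaller than the plain one. In a boosted-tree setting, mu would be the tree ensemble's prediction for each policy rather than a fixed number.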
September 24, 2023