Algorithmic Pricing is our new proprietary framework for building Pricing Models using Machine Learning. At the core of this framework is Penalised Regression, which we supplement with three other learning algorithms to boost performance. To download the 2015 GIRO Paper and the 2016 GI Pricing Seminar Presentation on Penalised Regression, please sign up at the bottom of this page.
In this case study we benchmark Penalised Regression against the Traditional GLM Approach, using a claims frequency dataset.
The Data
This data set is based on one-year vehicle insurance policies taken out in 2004 or 2005. There are 67,656 policies of which 4,624 (6.8%) had at least one claim. The variables in this dataset are listed below.
Testing Protocol
1. We split the data randomly into 50% Training and 50% Testing.
2. We built two claim frequency models, one with the Traditional approach and one with the Algorithmic approach, using only the Training data.
3. We used the models developed in step 2 to predict the number of claims for the Test data.
4. We computed the error for each set of predictions.
We repeated the above process 10 times, each time using a different seed for the random assignment in step 1. This repeated random-split approach (a resampling scheme in the spirit of bootstrapping) lets us attach a confidence estimate to any observed difference in performance between the two modelling approaches.
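The protocol above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the models used in this case study: the data are synthetic, the penalty is a simple L2 (ridge) term, the penalty weight of 50 is arbitrary, and the helper names (`fit_poisson`, `deviance_r2`, `benchmark`) are hypothetical.

```python
import numpy as np

def fit_poisson(X, y, penalty=0.0, n_iter=50):
    """Fit a log-link Poisson regression by Newton's method.

    penalty is an L2 (ridge) weight on the non-intercept coefficients;
    penalty=0.0 gives an ordinary, unpenalised GLM.
    """
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])      # add an intercept column
    P = penalty * np.eye(p + 1)
    P[0, 0] = 0.0                              # the intercept is not penalised
    beta = np.zeros(p + 1)
    beta[0] = np.log(y.mean())                 # start from the overall frequency
    for _ in range(n_iter):
        mu = np.exp(Xd @ beta)
        grad = Xd.T @ (y - mu) - P @ beta      # gradient of penalised log-likelihood
        hess = Xd.T @ (Xd * mu[:, None]) + P   # negative Hessian
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def predict(beta, X):
    return np.exp(beta[0] + X @ beta[1:])

def poisson_deviance(y, mu):
    # y * log(y / mu) is taken as 0 when y == 0
    safe_y = np.where(y > 0, y, 1.0)
    term = np.where(y > 0, y * np.log(safe_y / mu), 0.0)
    return 2.0 * np.sum(term - (y - mu))

def deviance_r2(y, mu):
    # share of the null (mean-only) deviance explained by the model
    null = poisson_deviance(y, np.full(y.shape, y.mean()))
    return 1.0 - poisson_deviance(y, mu) / null

def benchmark(X, y, penalty, n_repeats=10):
    """Repeat the 50/50 split with a new seed each time; return the
    per-split difference in test Deviance R2 (penalised minus unpenalised)."""
    diffs = []
    for seed in range(n_repeats):
        idx = np.random.default_rng(seed).permutation(len(y))
        half = len(y) // 2
        train, test = idx[:half], idx[half:]
        scores = []
        for pen in (0.0, penalty):
            beta = fit_poisson(X[train], y[train], pen)
            scores.append(deviance_r2(y[test], predict(beta, X[test])))
        diffs.append(scores[1] - scores[0])
    return np.array(diffs)

# Synthetic stand-in data: 20 rating factors, only 3 with a real effect.
rng = np.random.default_rng(2024)
n, p = 4000, 20
X = rng.normal(size=(n, p))
true_beta = np.zeros(p)
true_beta[:3] = [0.3, -0.2, 0.1]
y = rng.poisson(np.exp(-2.7 + X @ true_beta)).astype(float)

diffs = benchmark(X, y, penalty=50.0)
# Paired t-statistic on the per-split differences
t_stat = diffs.mean() / (diffs.std(ddof=1) / np.sqrt(len(diffs)))
print("wins:", int((diffs > 0).sum()), "of", len(diffs))
print("mean difference:", diffs.mean(), "t-statistic:", t_stat)
```

Both models see exactly the same split for each seed, so the differences are paired and can be assessed with a paired test rather than by comparing two separate averages.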
Penalised Regression: A Clear Winner
- Penalised Regression outperforms the Traditional approach in 9 of the 10 benchmark tests, with an average difference in Deviance R² of 0.04%. This result is statistically significant (p = 0.024).
- Penalised Regression is able to detect much smaller effects than the Traditional Approach.
- One-way plots of Predicted vs Actual Claims Frequency showcase the predictive power of Penalised Regression.
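The case study does not spell out how Deviance R² is defined. For reference, a common definition for claim counts, assumed here rather than taken from the source, uses the Poisson deviance of the model relative to that of a mean-only model:

```latex
D(y, \hat\mu) = 2 \sum_{i=1}^{n} \left[ y_i \log\frac{y_i}{\hat\mu_i} - (y_i - \hat\mu_i) \right],
\qquad
R^2_{\mathrm{dev}} = 1 - \frac{D(y, \hat\mu)}{D(y, \bar{y})}
```

Here $y_i$ is the observed claim count for policy $i$, $\hat\mu_i$ is the predicted count, $\bar{y}$ is the mean observed count, and $y_i \log(y_i/\hat\mu_i)$ is taken as 0 when $y_i = 0$. Under this definition a higher value means the model explains more of the deviance left by the mean-only model.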
Financial Impact
We estimate that a move to Penalised Regression would deliver at least a 1% improvement in loss ratio whilst increasing average premiums and total contribution.
The above example uses a small data set to show the power of Penalised Regression. In the real world, however, data sets are large: often too large to fit in the memory of most desktop computers. The relationships between inputs and response are often non-linear, and we need a way to search intelligently for interactions between inputs. Our “Algorithmic Pricing” approach, which we have successfully implemented at a leading UK General Insurer, overcomes all of these issues. We never get tired of outperforming traditional GLM models though, so if you would like to explore how the latest machine learning techniques can improve your pricing, please contact us.