101-B
Unsupervised Learning Methods Applied to Property-Casualty Databases

Thursday, April 3, 2014: 10:30 a.m.
Virginia Suite ABC (Washington Marriott Wardman Park)
Predictive modeling can be divided into two major kinds, supervised and unsupervised learning, distinguished primarily by the presence or absence of a dependent (target) variable in the data set used for modeling.  Supervised learning approaches probably account for the majority of modeling analyses.  This paper focuses on two infrequently used unsupervised approaches: PRIDIT and Random Forest.

Thus, unsupervised learning is a kind of analysis in which there is no explicit dependent variable.  Examples of unsupervised learning in insurance include modeling of questionable claims (for some action such as referral to a Special Investigation Unit) and the construction of territories by grouping together records that are geographically “close” to each other.  Databases used for questionable claims analysis often do not contain a fraud indicator as a dependent variable, and unsupervised learning methods are often used to address this limitation.  The PRIDIT (Principal Components of RIDITs) and Random Forest (a tree-based data-mining method) unsupervised learning methods will be introduced.  We will apply the methods to an automobile insurance database to model questionable[1] claims.

A simulated database containing features observed in actual questionable claims data was developed for this research.  The database is available from the author.
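
To make the PRIDIT idea concrete, the following is a minimal Python sketch, not the author's implementation. It assumes each indicator is an ordinal code in which a higher value is more suspicious, scores each category as the proportion of claims below it minus the proportion above it (one common RIDIT variant), and takes the first principal component of the standardized scores as the claim-level score. The function names and the toy data are hypothetical and are not drawn from the simulated database described above.

```python
import numpy as np

def ridit_scores(X):
    """RIDIT-style score per column: P(category below) - P(category above)."""
    X = np.asarray(X)
    n, k = X.shape
    B = np.empty((n, k))
    for j in range(k):
        cats, counts = np.unique(X[:, j], return_counts=True)
        p = counts / n                      # category proportions, sorted by code
        below = np.cumsum(p) - p            # mass strictly below each category
        above = 1.0 - below - p             # mass strictly above each category
        B[:, j] = (below - above)[np.searchsorted(cats, X[:, j])]
    return B

def pridit(X):
    """First principal component of standardized RIDIT scores.

    Assumes every column has at least two distinct categories,
    otherwise the standard deviation below would be zero.
    """
    B = ridit_scores(X)
    Z = (B - B.mean(axis=0)) / B.std(axis=0)
    corr = (Z.T @ Z) / len(Z)               # correlation matrix of the scores
    _, eigvecs = np.linalg.eigh(corr)       # eigenvalues in ascending order
    w = eigvecs[:, -1]                      # weights of the largest component
    if w.sum() < 0:                         # eigenvector sign is arbitrary
        w = -w
    return Z @ w, w

# Hypothetical toy data: 8 claims, 3 ordinal indicators (made up for illustration).
toy = np.array([
    [0, 0, 1],
    [0, 1, 0],
    [1, 1, 2],
    [0, 0, 0],
    [1, 1, 2],
    [0, 0, 1],
    [1, 0, 1],
    [0, 0, 0],
])
scores, weights = pridit(toy)
```

Claims can then be ranked by `scores`, and `weights` gives a rough indication of how strongly each indicator contributes to the ranking. No dependent variable is used at any step.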


[1] The simulated data is based on a research database originally constructed to investigate claims that were suspected not to be legitimate, such as staged accidents and inflated damages.  The term “fraudulent” is generally not used in referring to such claims, because claims that meet the definition of criminal fraud are a very small percentage of all claims.

Presentation 1
Louise Francis, Consulting Principal, Francis Analytics & Actuarial Data Mining Inc
Handouts
  • Two Unsupervised Learning Techniques v3.1.docx (256.8 kB)