This document discusses how machine learning and big data analytics can transform the insurance industry. It provides an overview of how automated machine learning works and its benefits for insurers, including higher returns on investment. Specific use cases discussed include underwriting triage, pricing, claims management, and fraud prevention. The document also addresses key data challenges for insurers and how a unified data platform can help bring different data sources together for machine learning. It promotes the idea that automated machine learning solutions can make machine learning more accessible, affordable and inclusive for organizations.
2. Agenda
Mihaela Risca
Sr. Solutions Marketing Manager
Financial Services
Cloudera
Unlocking the Value of Insurance Data
Satadru Sengupta
Gen Mgr. Insurance
DataRobot
Automated Machine Learning – A Formula
for Higher ROI for Insurers
3. Data is at the center of the Insurance market
There are two different alignments of these components in the market:
• When data and analytics capability are bundled with capital, we have an insurance company.
• When it is bundled with demand, we have an advisor or broker.
8. Why Big Data + Machine Learning?
• Machine learning thrives on
growing data sets
• Bring disparate data
sources together
• Real time streaming
9. Machine Learning Use Cases in Insurance
Customer Acquisition
Marketing, customer retention, prioritization.
Pricing
Equating risk and price, driving life-time value (LTV).
Underwriting
Underwriting triage: select the top 10% of the available risk for further analysis.
Prevent Claim Fraud
Identifying claims with highest likelihood of being fraudulent.
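The "top 10%" selection mentioned for underwriting triage can be sketched as a simple ranking step. This is a minimal illustration only: the scores, the application IDs, and the function name `select_top_fraction` are hypothetical and do not come from the presentation.

```python
import math

def select_top_fraction(scores, fraction=0.10):
    """Return the IDs with the highest scores, keeping `fraction` of all items."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    k = math.ceil(len(scores) * fraction)          # how many items to keep
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]

# Hypothetical model scores keyed by application ID (illustrative values only).
scores = {f"APP-{i:02d}": i / 20 for i in range(1, 21)}
flagged = select_top_fraction(scores)              # 2 of 20 applications
```

The same ranking applies to fraud prevention: replace the underwriting score with a predicted fraud likelihood per claim.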
10. Poll the Audience
Where in your organization do you see the most value in introducing
machine learning?
1. Customer acquisition and retention
2. Underwriting/Actuarial
3. Quoting/Claims management
4. Fraud detection and prevention
5. Other
11. Key Data Management Challenges for Insurers
Fragmented Systems
and Data Silos
Limited Access to
Right Data at the
Right Time
Strategic Decisions
Based on Subsets of
Data
Unable to Tap into
New Data Sources or
Correlate Data from
Multiple Sources
Simultaneously
Disparate View of
Customers, Markets
and Risks
Poor Data Quality
and Lack of
Governance
12. One Data Platform for Many Applications
• Handle real-time data ingest from diverse sources
• Governance and Security
• Deployment Flexibility
• Machine Learning Capabilities
• Diverse Analytical Options
• Combine Data from Different Sources
• Scale easily & cost-effectively
[Diagram labels: Data Sources (Fitness, Car Telematics, Other); Batch or Real-time Data Streams; Data Ingest; Data Mgmt. Hub; Data Storage & Processing; Reporting, Analytics & Auditing; Data Governance (Data Lineage, Data Protection); Applications]
13. "New technology is transforming the
way we work, and it is allowing the
competition to do better than what we
can. The strange thing is we know the
urgency, and yet there is inertia."
Inga Beale, CEO of Lloyd's of London
February 2017
14. Three Strategic Areas of Focus
1. Technology
2. Consumer & Market Economics
3. Data Science & Machine Learning
… and they are interconnected.
15. Machine Learning Applications in Insurance
1. Risk Selection & Pricing
2. Claims, Fraud and Litigation Management
3. Operations and Expenses Management
“machine learning is the secret sauce for the product of
tomorrow.” Google, 2015
16. Profitable Growth & Managing Expenses
Becoming a 21st Century Insurance Company
17. Life Insurance Example 1
Underwriting Triage
• Predicted low risk to fast track
process
• Predicted high risk to traditional
underwriting for manual review
Business Impact
• Cost reduction through automation of
reviews of applicants
• Increased likelihood of acquisition
due to fast track underwriting
• Higher underwriting profitability by
targeting the review process on
underwriting loss avoidance
Specific examples from clients
• Predict the likelihood of an insured being in a preferred class or not – as
determined by risk factors such as smoking status, existing condition, terminal
disease
• Predict the most likely class among several classes
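The fast-track versus manual-review split described above amounts to thresholding a predicted risk. The sketch below assumes a model that outputs a probability between 0 and 1; the threshold of 0.2, the applicant values, and the function name `route_application` are hypothetical choices for illustration, not figures from the clients mentioned.

```python
def route_application(predicted_risk, fast_track_threshold=0.2):
    """Route one applicant based on a model's predicted risk in [0, 1]."""
    if not 0.0 <= predicted_risk <= 1.0:
        raise ValueError("predicted_risk must be a probability")
    if predicted_risk < fast_track_threshold:
        return "fast_track"        # predicted low risk: automated process
    return "manual_review"         # predicted high risk: traditional underwriting

# Hypothetical predicted risks for three applicants.
applicants = {"A": 0.05, "B": 0.35, "C": 0.19}
routes = {name: route_application(p) for name, p in applicants.items()}
```

In practice the threshold would be chosen by weighing the cost of manual review against the cost of underwriting losses.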
18. Life Insurance Example 2
Predict mortality risks among patients in remission of cancer:
○ Simplify the underwriting process: patients with good health prospects don't need to go through manual medical verification, while adverse selection is still avoided
○ Reduce claim costs by identifying high-risk patients and creating more accurate underwriting rules
The ML model identifies patients with a very high risk of mortality:
● 5 times riskier than average
● Around 10% of patients
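A figure such as "5 times the average risk for around 10% of patients" is a lift: the event rate inside the flagged segment divided by the event rate over all patients. The sketch below shows the arithmetic on synthetic data constructed to reproduce those two numbers; it is not the client data, and `segment_lift` is a hypothetical helper name.

```python
def segment_lift(outcomes, in_segment):
    """Return (lift, share): segment event rate / overall rate, and segment size share."""
    if not outcomes or len(outcomes) != len(in_segment):
        raise ValueError("inputs must be non-empty and of equal length")
    overall_rate = sum(outcomes) / len(outcomes)
    segment_outcomes = [y for y, flag in zip(outcomes, in_segment) if flag]
    if not segment_outcomes or overall_rate == 0:
        raise ValueError("need a non-empty segment and at least one event")
    segment_rate = sum(segment_outcomes) / len(segment_outcomes)
    return segment_rate / overall_rate, len(segment_outcomes) / len(outcomes)

# Synthetic example: 100 patients, 10 flagged as very high risk by the model.
# 5 of the 10 flagged patients die (50%); 10 of all 100 patients die (10%).
in_segment = [True] * 10 + [False] * 90
outcomes = [1] * 5 + [0] * 5 + [1] * 5 + [0] * 85
lift, share = segment_lift(outcomes, in_segment)
```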
20. Machine Learning Strategy: Where Is It Failing?
• A lack of data vision
• Hiring and retaining good data scientists is impossible
• Lack of Inclusiveness: Targeted end-users are not included in
the machine learning problem solving process.
HBR Article : “Stop searching for that elusive Data Scientist”
21. New Technology Opens Up New Possibilities To Executives
Artificial Intelligence & Automation
make Machine Learning Affordable,
Pervasive and Inclusive
22. Poll the Audience
How do you primarily develop and deploy machine learning solutions
in your organization today?
1. Multiple, small data science teams
2. One, big enterprise data science team
3. Outsource to consulting
4. We use automated machine learning
5. We currently don’t use machine learning
23. Elements of Automated Machine Learning
Smart
● Accurate
● Appeal to experienced data scientists
● Control buttons are accessible to the users
Easy to Use
● Intuitive, fully automated workflow
● Needs minimum inputs but has guardrails
● Interpretable & transparent
● Deployment focused
24. A 10-Minute Journey to Automated Machine
Learning (AML) Using the DataRobot Platform
Can we predict which patients will return to the
hospital within the first 30 days after discharge?
Demo
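Before any model can be trained on the readmission question, each hospital stay needs a yes/no target. A minimal sketch of that labeling step follows, assuming each record holds a discharge date and the date of the next admission (or none). The record layout and the function name `readmitted_within` are assumptions for illustration and are not taken from the demo.

```python
from datetime import date

def readmitted_within(discharge, next_admission, window_days=30):
    """Return 1 if the next admission falls within `window_days` of discharge, else 0."""
    if next_admission is None:
        return 0                                   # patient never came back
    gap = (next_admission - discharge).days
    return int(0 <= gap <= window_days)

# Hypothetical stays: (discharge date, next admission date or None).
stays = [
    (date(2017, 1, 10), date(2017, 1, 25)),        # back after 15 days
    (date(2017, 2, 1), date(2017, 4, 1)),          # back after 59 days
    (date(2017, 3, 5), None),                      # no readmission
]
labels = [readmitted_within(d, n) for d, n in stays]
```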
25. What Are DataRobot's Capabilities on Cloudera?
• HDFS ingest: DataRobot (DR) can use data stored in HDFS directly
• Hadoop modeling: train ML models directly on the Cloudera data nodes
• Hadoop scoring: any model can then be deployed directly on Hadoop
○ Distributed (each node scores a data split)
○ Uses Spark
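The idea that "each node scores a data split" can be illustrated without a cluster. In the sketch below, worker threads stand in for Hadoop nodes and a toy function stands in for the deployed model. None of this is the DataRobot or Spark API; the names `split`, `score_chunk`, and `score_distributed` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split(rows, n_parts):
    """Split rows into n_parts contiguous chunks of near-equal size."""
    size, extra = divmod(len(rows), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(rows[start:end])
        start = end
    return chunks

def score_chunk(chunk):
    """Stand-in model: maps each row to a score (assumed for illustration)."""
    return [row["claims"] / 10 for row in chunk]

def score_distributed(rows, n_workers=3):
    """Score each chunk on its own worker, then stitch results back in order."""
    chunks = split(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(score_chunk, chunks))   # map preserves chunk order
    return [score for part in parts for score in part]

rows = [{"claims": c} for c in range(7)]
scores = score_distributed(rows)
```

Because scoring one row does not depend on any other row, the work divides cleanly across splits, which is why this step scales with the number of nodes.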
26. Cloudera/DataRobot Integration Details
DataRobot has the highest level of integration with Cloudera
• Cloudera Parcels: a few clicks to install DR in Cloudera Manager
• Cloudera CSDs: can use all the functionality of Cloudera Manager (monitoring, resource mgmt…)
• Kerberos / Sentry: secured authentication
• YARN: all the resources consumed by DataRobot are managed by YARN
• Spark: DataRobot uses Spark for Hadoop scoring
28. Apache Spark Ecosystem with Spark MLlib
The Spark MLlib API is available in the Scala, Java, and Python programming languages
29. Training from Cloudera and DataRobot
● Introduction to Machine Learning - Cloudera Training
https://www.cloudera.com/more/training/courses/intro-machine-learning.html
● Data Science for Executives - DataRobot Training
https://www.datarobot.com/education/for-executives/
● Machine Learning with DataRobot - DataRobot Training
https://www.datarobot.com/education/for-business-analysts/
30. Learn More & Contact Us
https://www.cloudera.com/solutions/insurance.html
Cloudera
Follow us: @Cloudera
mihaela@cloudera.com
Taneja Group Spark Market Adoption Report : LINK
DataRobot Overview: LINK
https://www.datarobot.com/go/insurance/
Follow us: @DataRobot
satadru@datarobot.com
DataRobot
Executive Briefing: LINK
The Machine Learning Renaissance: LINK
Register for Wrangle Conference: July 20, San Francisco
http://wrangleconf.com/