Breakout: Hadoop and the Operational Data Store

© Cloudera, Inc. All rights reserved.
More Data in Less Time
Deploying an Operational Data Store with Cloudera

Trends in the Market
Trends Driving Change
Internet of Things: 16 billion connected devices generating more data (Source: Forbes)
Data Storage Costs: “It will soon be technically feasible & affordable to record & store everything…” (Source: New York Times)
Resource Intensive ELT: ELT drives up to 80% of database capacity (Source: Syncsort)

Customers are augmenting their
traditional architectures for
modern business needs.

Operational Data Store (ODS):
Ingesting, storing, and preparing data for
both operational and analytical use.
(AKA: Operational Data Warehouse, RDBMS, Storage)

ODS Use Cases
ETL Offload: Offload resource-intensive ETL workloads from systems
EDW Optimization: Migrate old data and ELT workloads off of the EDW
Active Archive: Store old data online so analysts can access historic data
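
The Active Archive use case (keeping old data online so analysts can still reach it) can be illustrated with a minimal Python sketch. The one-year retention window, the `day` field, and the function names `split_by_age` and `query` are hypothetical, chosen only for illustration; they are not taken from the deck or from any Cloudera API.

```python
from datetime import date, timedelta

# Hypothetical retention window: rows newer than this stay in the warehouse.
RETENTION = timedelta(days=365)

def split_by_age(rows, today, retention=RETENTION):
    """Return (warehouse_rows, archive_rows) split on each row's 'day' field."""
    cutoff = today - retention
    warehouse = [r for r in rows if r["day"] >= cutoff]
    archive = [r for r in rows if r["day"] < cutoff]
    return warehouse, archive

def query(warehouse, archive, start, end):
    """Read across both tiers, so archived rows stay available to analysts."""
    return [r for r in warehouse + archive if start <= r["day"] <= end]

rows = [
    {"id": 1, "day": date(2013, 3, 1)},
    {"id": 2, "day": date(2015, 1, 15)},
    {"id": 3, "day": date(2015, 6, 1)},
]
warehouse, archive = split_by_age(rows, today=date(2015, 6, 30))
historic = query(warehouse, archive, date(2013, 1, 1), date(2013, 12, 31))
```

Here the 2013 row lands in the archive tier but is still returned by `query`, which is the point of an archive that stays online rather than going to offline storage.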

Goals of an Operational Data Store
Ingest Data, Store Data, Prepare Data
[Diagram: Traditional Architecture. Labels: Data Sources (Structured, Unstructured); Ingest; Operational Data Store (Storage #1, Storage #2, ... Storage N; Ingest, Process, Load); ETL; Enterprise Data Warehouse (ELT, Archive); Serve; BI System (Modeling, Reporting); Applications]

Challenges with a Traditional Architecture
1) Limited Data Ingest
2) Inefficient Data Processing
3) Data Archived
[Diagram: the traditional architecture, with callouts 1 to 3 marking where each challenge occurs]

A New Way Forward
1) Ingest More Data
2) Optimize Data Processing
3) Automated Secure Archive
[Diagram: Modern Architecture. Labels: Structured, Unstructured, Ingest, EDH, Active, Structured Data, Serve, ELT, ETL, Archive, Load, BI System (Modeling, Reporting), with callouts 1 to 3 marking each improvement]

RelayHealth Customer Story

About RelayHealth (A McKesson Business)
What does RelayHealth do?
RelayHealth is a financial solution of McKesson used to automate 2.4 billion financial transactions per year
200K Physicians, 2K Hospitals, 1.9K Payers/Health Plans
Who is McKesson?
Largest healthcare solution company in the world with $103+ billion in revenue
Headquarters in San Francisco and established in 1833
32K employees

RelayHealth’s Objectives
ETL Offload: Offload resource-intensive ETL workloads from systems
EDW Optimization: Migrate old data and ELT workloads off of the EDW
Active Archive: Store old data online so analysts can access historic data

The Pre-Hadoop Environment
RelayHealth Transaction BATCH Processing System
Challenges
1) Deleted & archived information
2) Batch wasn’t cutting it
3) Application & report latency
[Diagram labels: Claim Submitters, OLTP, Various Applications, RDBMS, EDW, Reports, Archive, with callouts 1 to 3 marking each challenge]

RelayHealth’s Modern Hadoop Architecture
RelayHealth Transaction Processing System
Traditional BATCH Processing replaced by Hadoop STREAM Processing
Improvements
1) Active archive on Hadoop
2) Stream & batch processing
3) Prepared for future use cases
[Diagram labels: Claim Submitters; Ingest: Kafka; Process: Spark Streaming; Store: HBase; Access: Search, Spark; Payer, Application, Reports, Modeling, with callouts 1 to 3 marking each improvement]
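
The ingest, process, store flow on this slide can be approximated in a few lines of plain Python. This is only a sketch under stated assumptions: a `deque` stands in for a Kafka topic, a micro-batch loop stands in for Spark Streaming, and a dict stands in for HBase. The claim fields and the function names `ingest` and `process_micro_batch` are hypothetical and do not come from RelayHealth's system or from any of those libraries.

```python
from collections import deque

def ingest(queue, claims):
    """Append incoming claims to the queue (stand-in for a Kafka topic)."""
    queue.extend(claims)

def process_micro_batch(queue, store, batch_size=2):
    """Drain up to batch_size claims, mark them processed, upsert by claim id.

    The micro-batch loop stands in for Spark Streaming; the dict stands in
    for a key-value store such as HBase.
    """
    processed = 0
    while queue and processed < batch_size:
        claim = queue.popleft()
        store[claim["claim_id"]] = {**claim, "status": "processed"}
        processed += 1
    return processed

queue, store = deque(), {}
ingest(queue, [
    {"claim_id": "c1", "amount": 120.0},
    {"claim_id": "c2", "amount": 75.5},
    {"claim_id": "c3", "amount": 310.0},
])
first = process_micro_batch(queue, store)   # handles c1 and c2
second = process_micro_batch(queue, store)  # handles c3
```

Because each claim is written to the store as soon as its micro-batch runs, downstream readers do not have to wait for an end-of-day batch, which is the contrast the slide draws between batch and stream processing.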

Business and Technical ROI
Technology ROI
1) Active archive and Navigator for HIPAA compliance
2) Prepared for future use cases
3) Data ingest goes from end of day to near real-time
Business ROI
1) Transactions processed in 20 ms vs. 1 hour previously
2) $250k in licensing and hardware savings per year
3) Greater flexibility with data ingest
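
To put the latency figure in perspective, the ratio between the two processing times quoted on the slide can be worked out directly. The variable names below are illustrative; only the 1 hour and 20 ms figures come from the deck.

```python
# Figures from the slide: 1 hour per transaction before, 20 ms after.
before_ms = 60 * 60 * 1000   # 1 hour expressed in milliseconds
after_ms = 20

# Integer division is exact here because 3,600,000 is a multiple of 20.
speedup = before_ms // after_ms
print(f"{speedup:,}x faster")  # 180,000x faster
```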

Key Learnings
Crawl, walk, run
It takes time, start now
Lean on experts in the community

Thank you
