Netflix - Realtime Impression Store

Nitin Sharma
Real-Time Impression Store @ Netflix
Sep 2018

● Recommendations @ Netflix
● Distributed Impression Store
● Distributed Processing Infrastructure
Agenda

● 129M+ active members
● 190+ countries
● Increasing launch of originals
● 2 Trillion kafka events/day
Recommendations @Scale

● Signals:
○ Impressions, Plays, View History
○ Searches, My Lists etc.
● Compute:
○ Offline Compute
○ Online Compute
Netflix Recommendation Signals

● Raw impressions:
○ Large dataset (PBs every day)
○ 10s of trillions of rows
○ $$$$
● Lack of signal:
○ Too noisy
○ Custom aggregation - expensive
Storage Philosophy

● Impression fatigue , Familiarity effect
● Under vs over Impressing
● Use in Recommendations - Score Online
Goals

● Want: I want all impressions
● Need:
○ Meaningful way to capture impressions
○ (W/O storing full impressions)
● Idea:
○ EMA Counts # of times - <User, Video> over
multiple decay windows.
How many times has a show been recommended
to you?

● Representation:
○ Exponential Moving Average (EMA) Score
○ Rate of impressions over a given window
○ EMA score over Windows:
■ <user, video, location> => <last_seen_ts, [1d, 1w, 1m, 6m, 1y]>
■ E.g <a, narcos, toprow> => <today, [0.9, 0.2, 0, 0, 0]>
● Benefits:
○ Better Signal
○ Memory Footprint
○ Extensible
Data model Definition - EMA

Scale & SLA
Dimension Attribute Scale
Storage
Videos * locations * users ~600B + entries
Raw Data size 150+TB
Indexing Updates
Bulk 600B+ entries < 1.5 hours
Real time 50M entries < 5mins
Query 99th Percentile Latency < 10s of ms
Patterns Read/Update/Write
Operations
Cost $$ >> $$$$
Availability 99.99% [~4m downtime/30
days]

Observation
Attributes Stringent Flexible
Predictably faster Reads
Bulk + RT writes
Strict Schema
Secondary Index
Easy Operations
Ser/Deser at Read (aka
blob storage)

Cache with props of a
DataStore?

● Format: <K,V> storage & Schema-less
● Underlying: Memcached + Cross Region support
● Storage: In memory + Ext Storage
Evcache

Architecture
Server
EVCar
Application
Client Library
Client

Client Side Writing (set, delete, add, etc.)
us-west-2a us-west-2cus-west-2b
ClientClient Client

Client Side Reading (get)
us-west-2a us-west-2cus-west-2b
Client
Primary Secondary

● In Memory:
○ $$$$
○ No persistence
○ Lack of consistency - Node crash etc
● Layered Storage?
But wait...

● Scenario:
○ 10% of active items => 90% of hits
○ Large values eat up RAM
● Philosophy:
○ LRU <k,v> on disk; Index in RAM
○ MRU key + value in RAM
○ Values => NVMe; SSD
Ext Storage

Ext Storage
● Effects:
○ Reduce overall RAM
○ Reduce Server Count
○ Increase overall Cache size

Cost : 1 TB of storage space
EVCache - RAM Only
20 r4.2xlarge
EVCache - Flash
1 i3.xlarge
Network Bandwidth : ~40 Gbps Network Bandwidth : ~2.5 Gbps

● Cross Region Replication
● Backup & Recovery
Data Store (ish) Features

Region BRegion A
APP APP
Repl Proxy
Repl Relay
1 mutate
2 send
metadata
3 poll msg
5
https send
m
sg
6 mutate4
get data
for set
Kafka Repl Relay Kafka
Repl Proxy
Cross-Region Replication
7 read

● Cross Region:
○ Relay: Event forward through kafka
○ Proxy: Event receiver and mutate cache
■ Stateless
■ Easy Throttle
● CAS:
■ Client-> Replica
■ Client -> Kafka
■ Write Retries (by proxy) until replica is consistent.
Characteristics

● Client Side
○ Writes: All replicas in a region
○ Compression: Gzip
○ Reads:
■ Quorum - Latches
■ Consistency All
● Server Side
○ Writes: Cross regions [replication]
○ Reads: Basic read repairs [Compare & Set]
○ Storage: Mem & External
Client & Server Responsibilities

Backup & Restore Architecture
Cache Warmer
(Spark)
Application
Client Library
Client
Control
S3
Data Flow
Metadata Flow
Control Flow

● Protobuf
● EMA Payload:
○ (last_seen_ts, Float[1d, 1w, 1m, 6m, 1y])
● Overall:
○ Map<VideoId,Map<locationId,EMA>>
EMA Representation

Client Side Write Trade-offs
ClientClient
● Writes:
○ Read->De-ser -> Update -> Ser -> Write
● Schema evolution

Batch Indexing Infrastructure (V1)
us-east-1 us-west
eu-west
Precompute/Live
Compute

RT Indexing Infrastructure (V2)
Precompute/Live
Compute
Rocksdb

● Engine: Apache Flink
● Processing Semantics:
○ Exactly once
○ Late arriving events
● RocksDB:
○ Zero data loss:
■ Failures/Crashes - Checkpoints/Savepoints
○ State Management:
■ Application state
RT Indexing Infrastructure

Chaos Engineering
● Region Failovers
● Crash Proof ?:
○ Replicas
○ Tail Latency tests
● Cross Region Replication:
○ Kafka:
■ 2-3x data
○ Low Lag:
■ Scale up Replay Proxies

Tiered Storage:
○ Memory
○ SSD/Nvme
○ EBS (Nitro based)
■ 80,000 IOPS
■ 14Gbps
Future Work

https://www.linkedin.com/in/knitinsharma/
nitins@netflix.com
Speaker Info

Netflix - Realtime Impression Store

Recomendados

Recomendados

Más contenido relacionado

La actualidad más candente

La actualidad más candente (20)

Similar a Netflix - Realtime Impression Store

Similar a Netflix - Realtime Impression Store (20)

Último

Último (20)

Netflix - Realtime Impression Store