The document discusses Netflix's distributed impression store that captures user impressions and recommendations at scale. Billions of entries are stored daily to track what content users have seen. An exponential moving average model is used to represent impression counts over multiple time windows in a compact way. The data is stored in a distributed cache-like system that uses memory and external storage for scale and persistence. The system replicates data across regions for availability.
10. ● Raw impressions:
○ Large dataset (PBs every day)
○ 10s of trillions of rows
○ $$$$
● Lack of signal:
○ Too noisy
○ Custom aggregation - expensive
Storage Philosophy
12. ● Impression fatigue, familiarity effect
● Under vs over Impressing
● Use in Recommendations - Score Online
Goals
13. ● Want: All impressions
● Need:
○ Meaningful way to capture impressions
○ (W/O storing full impressions)
● Idea:
○ EMA counts: # of times per <User, Video> over multiple decay windows.
How many times has a show been recommended to you?
14. ● Representation:
○ Exponential Moving Average (EMA) Score
○ Rate of impressions over a given window
○ EMA score over Windows:
■ <user, video, location> => <last_seen_ts, [1d, 1w, 1m, 6m, 1y]>
■ E.g., <a, narcos, toprow> => <today, [0.9, 0.2, 0, 0, 0]>
● Benefits:
○ Better Signal
○ Memory Footprint
○ Extensible
Data Model Definition - EMA
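The data model above can be sketched in a few lines of Python. The slides do not give the update formula, so this sketch assumes a common exponentially decayed count: each stored score is multiplied by exp(-dt / window) and then incremented by 1 on a new impression. The names `EmaRecord`, `impress`, and `WINDOWS`, the window lengths in seconds, and the lack of normalization are all assumptions for illustration, so the numbers will not match the slide's example values.

```python
import math
from dataclasses import dataclass, field

DAY = 86_400.0
# Decay windows named on the slide: 1d, 1w, 1m, 6m, 1y.
# The exact lengths in seconds are an assumption.
WINDOWS = (1 * DAY, 7 * DAY, 30 * DAY, 182 * DAY, 365 * DAY)


@dataclass
class EmaRecord:
    """Value stored per <user, video, location>: <last_seen_ts, [scores]>."""
    last_seen_ts: float = 0.0
    scores: list = field(default_factory=lambda: [0.0] * len(WINDOWS))

    def record_impression(self, ts: float) -> None:
        # Decay every window's score by the elapsed time, then count
        # the new impression (assumed update rule, not from the slides).
        dt = max(0.0, ts - self.last_seen_ts)
        self.scores = [
            s * math.exp(-dt / w) + 1.0 for s, w in zip(self.scores, WINDOWS)
        ]
        self.last_seen_ts = ts

    def scores_at(self, ts: float) -> list:
        # Read path: decay the stored scores to the query time without writing.
        dt = max(0.0, ts - self.last_seen_ts)
        return [s * math.exp(-dt / w) for s, w in zip(self.scores, WINDOWS)]


# Hypothetical in-process stand-in for the distributed store.
store = {}


def impress(user: str, video: str, location: str, ts: float) -> EmaRecord:
    rec = store.setdefault((user, video, location), EmaRecord())
    rec.record_impression(ts)
    return rec
```

One fixed-size record per key replaces the full impression history, which is where the memory-footprint benefit would come from; adding a window only means appending one more entry to `WINDOWS`.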
25. Client Side Reading (get)
[Diagram labels: Client; Primary, Secondary; zones us-west-2a, us-west-2b, us-west-2c]
26. ● In Memory:
○ $$$$
○ No persistence
○ Lack of consistency (node crashes, etc.)
● Layered Storage?
But wait...
27. ● Scenario:
○ 10% of active items => 90% of hits
○ Large values eat up RAM
● Philosophy:
○ Least recently used <k,v> on disk; index in RAM
○ Most recently used keys + values in RAM
○ Values => NVMe / SSD
Ext Storage
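The layering described above can be illustrated with a minimal sketch: hot keys and values live in an in-memory LRU, and entries evicted from it are written to a file while only their location stays in RAM. The class `TieredStore`, its methods, and the flat-file layout are hypothetical and are not the system's actual storage engine; compaction of stale on-disk values is omitted.

```python
import os
import tempfile
from collections import OrderedDict


class TieredStore:
    """Hot <k,v> in RAM; cold values on disk with an in-RAM index."""

    def __init__(self, path: str, hot_capacity: int):
        self.hot = OrderedDict()      # most recently used key -> value
        self.index = {}               # cold key -> (offset, length) on disk
        self.hot_capacity = hot_capacity
        self.file = open(path, "w+b")

    def put(self, key, value: bytes) -> None:
        self.hot[key] = value
        self.hot.move_to_end(key)     # mark as most recently used
        self.index.pop(key, None)     # any cold copy is now stale
        self._evict()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        loc = self.index.get(key)
        if loc is None:
            return None
        offset, length = loc
        self.file.seek(offset)
        value = self.file.read(length)
        self.put(key, value)          # promote back into RAM
        return value

    def _evict(self) -> None:
        # Spill least recently used entries to disk; keep only their index.
        while len(self.hot) > self.hot_capacity:
            key, value = self.hot.popitem(last=False)
            self.file.seek(0, os.SEEK_END)
            offset = self.file.tell()
            self.file.write(value)
            self.file.flush()
            self.index[key] = (offset, len(value))


store = TieredStore(os.path.join(tempfile.mkdtemp(), "values.dat"), hot_capacity=2)
```

With a skewed access pattern like the 10% / 90% scenario above, most reads are served from `hot`, and a miss costs one indexed disk read instead of a scan.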
34. ● Client Side
○ Writes: All replicas in a region
○ Compression: Gzip
○ Reads:
■ Quorum - Latches
■ Consistency All
● Server Side
○ Writes: Cross regions [replication]
○ Reads: Basic read repairs [Compare & Set]
○ Storage: Mem & External
Client & Server Responsibilities
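The client-side responsibilities listed above can be sketched as follows, assuming in-process stand-ins for the replicas. `Replica` and `Client` are hypothetical names, the majority quorum size is an assumption, and the thread pool stands in for the latch-based wait mentioned on the slide. Cross-region replication and server-side read repair are not modelled.

```python
import gzip
from collections import Counter
from concurrent.futures import ThreadPoolExecutor, as_completed


class Replica:
    """Hypothetical stand-in for one cache node in a region."""

    def __init__(self):
        self.data = {}

    def set(self, key, blob: bytes) -> None:
        self.data[key] = blob

    def get(self, key):
        return self.data.get(key)


class Client:
    def __init__(self, replicas):
        self.replicas = replicas
        self.quorum = len(replicas) // 2 + 1   # assumed: simple majority
        self.pool = ThreadPoolExecutor(max_workers=len(replicas))

    def write(self, key, value: bytes) -> None:
        # Compress once on the client, then write to all replicas in the region.
        blob = gzip.compress(value)
        for replica in self.replicas:
            replica.set(key, blob)

    def read(self, key):
        # Ask every replica, but stop waiting once a quorum has answered.
        futures = [self.pool.submit(r.get, key) for r in self.replicas]
        responses = []
        for future in as_completed(futures):
            responses.append(future.result())
            if len(responses) >= self.quorum:
                break
        # Take the most common answer among the quorum responses.
        blob, _ = Counter(responses).most_common(1)[0]
        return None if blob is None else gzip.decompress(blob)
```

Returning after a quorum rather than after every replica keeps read latency from being bound by the slowest node; the "Consistency All" mode on the slide would instead wait for all responses.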
35. Backup & Restore Architecture
[Diagram labels: Cache Warmer (Spark), Application, Client Library, Client, Control, S3. Arrow legend: Data Flow, Metadata Flow, Control Flow]