Cassandra nice use-cases and worst anti-patterns 
DuyHai DOAN, Technical Advocate 
@doanduyhai
Agenda
Anti-patterns
• Queue-like designs
• CQL null values
• Intensive update on same column
• Design around dynamic schema
Nice use-cases
• Rate-limiting
• Anti Fraud
• Account validation
• Sensor data timeseries
Worst anti-patterns
• Queue-like designs
• CQL null
• Intensive update on same column
• Design around dynamic schema
Failure level
☠ 
☠☠ 
☠☠☠ 
☠☠☠☠
Queue-like designs
Adding new message ☞ 1 physical insert
Consuming message = deleting it ☞ 1 physical insert (tombstone)
Transactional queue = re-inserting messages ☞ physical insert * <many>
Queue-like designs
FIFO queue (physical writes in order ☞ live queue content; consuming a message writes a tombstone for it)
A ☞ { A }
A B ☞ { A, B }
A B C ☞ { A, B, C }
A B C A ☞ { B, C }
A B C A D ☞ { B, C, D }
A B C A D B ☞ { C, D }
A B C A D B C ☞ { D }
FIFO queue, worst case
A A A A A A A A A A ☞ { }
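The FIFO walk-through above can be mimicked with a small in-memory model. This is only a sketch in Python, not code from the talk: `QueuePartition`, `enqueue`, `consume` and `live` are hypothetical names, and compaction and tombstone garbage collection are deliberately ignored. It shows that the number of physical cells keeps growing while the live queue stays small.

```python
class QueuePartition:
    """Toy model of one partition used as a FIFO queue (hypothetical)."""

    def __init__(self):
        # Physical writes in arrival order: ("insert" | "tombstone", message)
        self.cells = []

    def enqueue(self, message):
        # Adding a message is one physical insert.
        self.cells.append(("insert", message))

    def consume(self):
        # Consuming = deleting: one more physical write, a tombstone.
        live = self.live()
        if not live:
            return None
        head = live[0]
        self.cells.append(("tombstone", head))
        return head

    def live(self):
        # A read has to scan every physical cell to find what is still alive.
        deleted = {m for kind, m in self.cells if kind == "tombstone"}
        return [m for kind, m in self.cells if kind == "insert" and m not in deleted]


q = QueuePartition()
for m in ["A", "B", "C"]:
    q.enqueue(m)
q.consume()      # consumes A
q.enqueue("D")
q.consume()      # consumes B
q.consume()      # consumes C

print(len(q.cells), q.live())  # 7 ['D']
```

Seven physical cells are scanned to return a single live message, which matches the last step of the walk-through.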
Failure level
☠☠☠
CQL null semantics
Reading null value means
• value does not exist (has never been created)
• value deleted (tombstone)
SELECT age FROM users WHERE login = 'ddoan'; → NULL
CQL null semantics
Writing null means
• delete value (creating tombstone)
• even though it does not exist
UPDATE users SET age = NULL WHERE login = 'ddoan';
CQL null semantics
Seen in production: prepared statement 
UPDATE users SET 
age = ?, 
… 
geo_location = ?, 
mood = ?, 
… 
WHERE login = ?;
CQL null semantics
Seen in production: bound statement
preparedStatement.bind(33, …, null, null, null, …);
null ☞ tombstone creation on each update …
Partition jdoe (☒ = tombstone):
age | name | geo_loc | mood | status
33 | John DOE | ☒ | ☒ | ☒
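One way to avoid binding null is to build the statement from the columns that actually carry a value. The sketch below is an illustration under assumptions, not the approach described in the talk: `build_update` is a hypothetical helper, it only produces the statement text and the bound values (no driver call), and table and column names must come from trusted code, never from user input.

```python
def build_update(table, key_column, key_value, values):
    """Build an UPDATE that only sets the columns we have a value for."""
    present = {col: val for col, val in values.items() if val is not None}
    if not present:
        return None, ()  # nothing to write, so no statement at all
    assignments = ", ".join(f"{col} = ?" for col in present)
    statement = f"UPDATE {table} SET {assignments} WHERE {key_column} = ?;"
    params = tuple(present.values()) + (key_value,)
    return statement, params


statement, params = build_update(
    "users", "login", "jdoe",
    {"age": 33, "name": "John DOE", "geo_location": None, "mood": None, "status": None},
)
print(statement)  # UPDATE users SET age = ?, name = ? WHERE login = ?;
print(params)     # (33, 'John DOE', 'jdoe')
```

The trade-off is one prepared statement per combination of columns instead of a single generic one.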
Failure level
☠
Intensive update
Context 
• small start-up 
• cloud-based video recording & alarm 
• internet of things (sensor) 
• 10 updates/sec for some sensors
Intensive update on same column
Data model
CREATE TABLE sensor_data (
sensor_id bigint,
value double,
PRIMARY KEY(sensor_id));
sensor_id | value = 45.0034
Intensive update on same column
Updates
UPDATE sensor_data SET value = 45.0034 WHERE sensor_id = …;
UPDATE sensor_data SET value = 47.4182 WHERE sensor_id = …;
UPDATE sensor_data SET value = 48.0300 WHERE sensor_id = …;
sensor_id | value (t1) = 45.0034
sensor_id | value (t13) = 47.4182
sensor_id | value (t36) = 48.0300
Intensive update on same column
Read
SELECT value FROM sensor_data WHERE sensor_id = …;
read N physical columns, only 1 useful …
sensor_id | value (t1) = 45.0034
sensor_id | value (t13) = 47.4182
sensor_id | value (t36) = 48.0300
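The read cost described above can be illustrated with a toy model. This sketch assumes the worst case where every update landed in a different SSTable, and it ignores memtables, bloom filters and caches; `read_value` and the dictionary layout are hypothetical, chosen only to show that all versions are examined and the newest timestamp wins.

```python
def read_value(sstables, sensor_id):
    """Return (latest value, number of physical cells examined)."""
    candidates = [
        cell
        for sstable in sstables
        for cell in sstable
        if cell["sensor_id"] == sensor_id
    ]
    if not candidates:
        return None, 0
    # Last write wins: keep the cell with the highest timestamp.
    newest = max(candidates, key=lambda cell: cell["timestamp"])
    return newest["value"], len(candidates)


sstables = [
    [{"sensor_id": 1, "timestamp": 1, "value": 45.0034}],
    [{"sensor_id": 1, "timestamp": 13, "value": 47.4182}],
    [{"sensor_id": 1, "timestamp": 36, "value": 48.0300}],
]
value, cells_read = read_value(sstables, 1)
print(value, cells_read)  # 48.03 3
```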
Intensive update on same column
Solution 1: leveled compaction (if your I/O can keep up)
sensor_id | value (t1) = 45.0034
sensor_id | value (t13) = 47.4182
sensor_id | value (t36) = 48.0300
☞ after compaction
sensor_id | value (t36) = 48.0300
Intensive update on same column
Solution 2: reversed timeseries & DateTiered compaction strategy
CREATE TABLE sensor_data (
sensor_id bigint,
date timestamp,
sensor_value double,
PRIMARY KEY((sensor_id), date))
WITH CLUSTERING ORDER BY (date DESC);
Intensive update on same column
SELECT sensor_value FROM sensor_data WHERE sensor_id = … LIMIT 1;
sensor_id | date3 (t3) = 48.0300 | date2 (t2) = 47.4182 | date1 (t1) = 45.0034 | …
Data cleaning by configuration (max_sstable_age_days)
Failure level
☠☠
Design around dynamic schema
Customer emergency call
• 3-node cluster almost full
• impossible to scale out
• 4th node in JOINING state for 1 week
• disk space is filling up, production at risk!
Design around dynamic schema
After investigation
• 4th node in JOINING state because streaming is stalled
• NPE in logs
Cassandra source-code to the rescue
Design around dynamic schema
public class CompressedStreamReader extends StreamReader
{
    …
    @Override
    public SSTableWriter read(ReadableByteChannel channel) throws IOException
    {
        …
        Pair<String, String> kscf = Schema.instance.getCF(cfId);
        // NPE here
        ColumnFamilyStore cfs = Keyspace.open(kscf.left).getColumnFamilyStore(kscf.right);
        …
    }
}
Design around dynamic schema
The truth is
• the devs dynamically drop & recreate tables every day
• dynamic schema is at the core of their design
Example:
DROP TABLE catalog_127_20140613;
CREATE TABLE catalog_127_20140614( … );
Design around dynamic schema
Failure sequence
[Diagram, three frames: ring with nodes n1, n2, n3, n4 and positions 1 to 6]
• frame 1: table catalog_x_y present on n1, n2, n3, n4
• frame 2: table catalog_x_z added on n1, n2, n3, n4
• frame 3: only catalog_x_z remains on the nodes; catalog_x_y ????
Design around dynamic schema
Consequences
• joining node always got stuck
• → cannot extend cluster
• → changing code takes time
• → production in danger (no space left)
• → sacrifice analytics data to survive
Design around dynamic schema
Nutshell
• dynamic schema changes as part of normal operations are not recommended
• concurrent schema AND topology change is an anti-pattern
Failure level
☠☠☠☠
Q & A
Nice Examples
• Rate limiting
• Anti Fraud
• Account Validation
• Sensor Data Timeseries
Rate limiting
Start-up company, reset password feature 
1) /password/reset 
2) SMS with token A0F83E63DB935465CE73DFE…. 
Phone number Random token 
3) /password/new/<token>/<password>
Rate limiting
Problem 1
• account created with premium phone number
• /password/reset x 100
Rate limiting
« money, money, money, give money, in the richman’s world » $$$
Rate limiting
Problem 2
• massive hack
• 10^6 /password/reset calls from a few accounts
• SMS messages are cheap
• ☞ but not at the scale of 10^6 per user per day
Rate limiting
Solution
• premium phone number ☞ Google libphonenumber
• massive hack ☞ rate limiting with Cassandra
Cassandra Time To Live
Time to live 
• built-in feature 
• insert data with a TTL in sec 
• expires server-side automatically 
• ☞ use as sliding-window
Rate limiting in action
Implementation
• threshold = max 3 password resets per sliding 24h window
• when /password/reset is called ☞ check threshold
• threshold reached ☞ error message/ignore
• threshold not reached ☞ log the attempt with TTL = 86400
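The check-then-log logic above can be sketched without a database. In this Python illustration an in-memory dictionary stands in for a table whose rows expire after their TTL; `ResetPasswordLimiter`, `allow` and the injectable `clock` are hypothetical names, and unlike a real cluster the check and the write here are not subject to concurrent requests.

```python
import time


class ResetPasswordLimiter:
    """In-memory stand-in for a table whose rows expire after a TTL (hypothetical)."""

    def __init__(self, threshold=3, ttl_seconds=86400, clock=time.time):
        self.threshold = threshold
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self.attempts = {}  # login -> list of expiry times

    def allow(self, login):
        now = self.clock()
        # Expired attempts disappear, as rows written with a TTL would server-side.
        live = [expiry for expiry in self.attempts.get(login, []) if expiry > now]
        if len(live) >= self.threshold:
            self.attempts[login] = live
            return False  # threshold reached: error message / ignore
        live.append(now + self.ttl_seconds)  # log the attempt with its TTL
        self.attempts[login] = live
        return True


fake_now = [0.0]
limiter = ResetPasswordLimiter(clock=lambda: fake_now[0])
results = [limiter.allow("jdoe") for _ in range(4)]
print(results)  # [True, True, True, False]

fake_now[0] = 86400.0  # 24h later the first three attempts have expired
after_window = limiter.allow("jdoe")
print(after_window)  # True
```

Because each attempt expires on its own, the window slides per attempt rather than resetting at a fixed time of day.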
Rate limiting 
demo
Anti Fraud
Real story 
• many special offers available 
• 30 mins international calls (50 countries) 
• unlimited land-line calls to 5 countries 
• …
Anti Fraud
Real story 
• each offer has a duration (week/month/year) 
• only one offer active at a time
Anti Fraud
Cassandra TTL
• check for existing offer before
SELECT count(*) FROM user_special_offer WHERE login = 'jdoe';
Anti Fraud
Cassandra TTL
• then grant new offer
INSERT INTO user_special_offer(login, offer_code, …)
VALUES('jdoe', '30_mins_international', …)
USING TTL <offer_duration>;
Account Validation
Requirement 
• user creates new account 
• sends sms/email link with token to validate account 
• 10 days to validate
Account Validation
How to?
• create account with 10 days TTL
INSERT INTO users(login, name, age)
VALUES('jdoe', 'John DOE', 33)
USING TTL 864000;
Account Validation
How to?
• create random token for validation with 10 days TTL
INSERT INTO account_validation(token, login, name, age)
VALUES('A0F83E63DB935465CE73DFE…', 'jdoe', 'John DOE', 33)
USING TTL 864000;
Account Validation
On token validation
• check that the token exists & retrieve user details
SELECT login, name, age FROM account_validation
WHERE token = 'A0F83E63DB935465CE73DFE…';
• durably re-insert user details without TTL
INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
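The three steps above (account with TTL, token with TTL, durable re-insert on validation) can be traced with a toy table. This is a sketch under assumptions: `TtlTable`, `validate` and the token value `token-123` are hypothetical, and the dictionary only imitates the expiry behaviour of rows written with `USING TTL`.

```python
class TtlTable:
    """Toy key-value table where each row may carry an expiry time (hypothetical)."""

    def __init__(self, clock):
        self.clock = clock
        self.rows = {}  # key -> (row, expiry time or None)

    def insert(self, key, row, ttl=None):
        expiry = None if ttl is None else self.clock() + ttl
        self.rows[key] = (row, expiry)

    def get(self, key):
        row, expiry = self.rows.get(key, (None, None))
        if row is None or (expiry is not None and expiry <= self.clock()):
            return None  # missing or expired
        return row


TEN_DAYS = 864000
now = [0]
users = TtlTable(lambda: now[0])
account_validation = TtlTable(lambda: now[0])

# Sign-up: both rows expire after 10 days unless the account is validated.
user = {"login": "jdoe", "name": "John DOE", "age": 33}
users.insert("jdoe", user, ttl=TEN_DAYS)
account_validation.insert("token-123", user, ttl=TEN_DAYS)


def validate(token):
    details = account_validation.get(token)
    if details is None:
        return False  # unknown or expired token
    users.insert(details["login"], details)  # re-insert durably, without TTL
    return True


now[0] = 5 * 86400
validated = validate("token-123")
now[0] = 30 * 86400
still_there = users.get("jdoe") is not None
print(validated, still_there)  # True True
```

An account that is never validated simply disappears after ten days, with no cleanup job.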
Sensor Data Timeseries
Requirements
• lots of sensors (10^3 – 10^6)
• medium to high insertion rate (0.1 – 10/sec)
• keep good load balancing
• fast read & write
Bucketing
CREATE TABLE sensor_data ( 
sensor_id text, 
date timestamp, 
raw_data blob, 
PRIMARY KEY(sensor_id, date)); 
sensor_id 
date1 date2 date3 date4 … 
blob1 blob2 blob3 blob4 …
Bucketing
Problems:
• limit of 2×10^9 physical columns
• bad load balancing (1 sensor = 1 node)
• wide row spans over many files
Bucketing
Idea:
• composite partition key: sensor_id:date_bucket
• tunable date granularity: per hour/per day/per month …
CREATE TABLE sensor_data (
sensor_id text,
date_bucket int, // format YYYYMMdd (daily) or YYYYMMddHH (hourly, as in the buckets shown here)
date timestamp,
raw_data blob,
PRIMARY KEY((sensor_id, date_bucket), date));
Buckets
sensor_id:2014091014
date1 date2 date3 date4 …
blob1 blob2 blob3 blob4 …
sensor_id:2014091015
date11 date12 date13 date14 …
blob11 blob12 blob13 blob14 …
Bucketing
Advantage:
• distribute load: 1 bucket = 1 node
• limit partition width (max x columns per bucket)
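On the client side, the bucket is derived from the measurement timestamp before each write. A minimal sketch, assuming hourly granularity and UTC; `hourly_bucket`, `partition_key` and the sensor name `sensor-1` are hypothetical.

```python
from datetime import datetime, timezone


def hourly_bucket(ts):
    """Return the date_bucket as an int in YYYYMMddHH form (UTC)."""
    return int(ts.astimezone(timezone.utc).strftime("%Y%m%d%H"))


def partition_key(sensor_id, ts):
    # Composite partition key: (sensor_id, date_bucket)
    return (sensor_id, hourly_bucket(ts))


t1 = datetime(2014, 9, 10, 14, 45, tzinfo=timezone.utc)
t2 = datetime(2014, 9, 10, 15, 10, tzinfo=timezone.utc)
print(partition_key("sensor-1", t1))  # ('sensor-1', 2014091014)
print(partition_key("sensor-1", t2))  # ('sensor-1', 2014091015)
```

Changing the format string (for example to `%Y%m%d`) is all it takes to move to daily buckets, which is the tunable granularity mentioned above.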
Bucketing
But how can I select raw data between 14:45 and 15:10?
14:45 → ? (in bucket sensor_id:2014091014)
15:00 → 15:10 (in bucket sensor_id:2014091015)
Bucketing
Solution
• use IN clause on partition key component
• with range condition on date column
☞ date column should be a monotonic function (increasing/decreasing)
SELECT * FROM sensor_data WHERE sensor_id = 'xxx'
AND date_bucket IN (2014091014, 2014091015)
AND date >= '2014-09-10 14:45:00.000'
AND date <= '2014-09-10 15:10:00.000';
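The values placed in the IN clause have to be computed by the application from the requested time range. A possible sketch, assuming hourly buckets, UTC timestamps and `start <= end`; `buckets_between` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def buckets_between(start, end):
    """List the hourly buckets (YYYYMMddHH ints) touched by [start, end]."""
    current = start.replace(minute=0, second=0, microsecond=0)
    buckets = []
    while current <= end:
        buckets.append(int(current.strftime("%Y%m%d%H")))
        current += timedelta(hours=1)
    return buckets


start = datetime(2014, 9, 10, 14, 45, tzinfo=timezone.utc)
end = datetime(2014, 9, 10, 15, 10, tzinfo=timezone.utc)
buckets = buckets_between(start, end)
print(buckets)  # [2014091014, 2014091015]
```

A long time range yields many buckets, which is a good moment to check the list length before sending the query.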
Bucketing Caveats
IN clause for #partition is not a silver bullet!
• use sparingly
• keep cardinality low (≤ 5)
• prefer parallel async queries
• ease of query vs perf
[Diagram: ring n1 … n8; with IN, a single coordinator node fetches sensor_id:2014091014 and sensor_id:2014091015; with an async client, each partition is queried separately]
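The "parallel async queries" alternative can be sketched with Python's `asyncio`. No real driver is used here: `fetch_bucket` is a hypothetical stand-in for one single-partition query, and `fetch_range` only shows the fan-out and the client-side merge.

```python
import asyncio


async def fetch_bucket(sensor_id, bucket):
    """Stand-in for one single-partition query (hypothetical, no real driver call)."""
    await asyncio.sleep(0.01)  # simulated network round-trip
    return [f"{sensor_id}:{bucket}:row"]


async def fetch_range(sensor_id, buckets):
    # One query per partition, all in flight at once; results merged client-side.
    per_bucket = await asyncio.gather(
        *(fetch_bucket(sensor_id, b) for b in buckets)
    )
    return [row for rows in per_bucket for row in rows]


rows = asyncio.run(fetch_range("sensor-1", [2014091014, 2014091015]))
print(rows)  # ['sensor-1:2014091014:row', 'sensor-1:2014091015:row']
```

`asyncio.gather` returns results in the order the queries were submitted, so the merged rows stay ordered by bucket.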
Q & A
Cassandra developers
Rule n°1
If you don’t know, ask for help
(me, Cassandra ML, PlanetCassandra, stackoverflow, …)
Rule n°2
Do not blind-guess troubleshooting alone in production
(ask for help, see rule n°1)
Rule n°3
Share with the community
(your best use-cases … and worst failures)
http://planetcassandra.org/
Thank You 
@doanduyhai 
duy_hai.doan@datastax.com

More Related Content

What's hot

Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking ExplainedThomas Graf
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayDataStax Academy
 
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...DataStax
 
Cilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDPCilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDPThomas Graf
 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersBrendan Gregg
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance ToolsBrendan Gregg
 
InfluxDB IOx Tech Talks: Query Processing in InfluxDB IOx
InfluxDB IOx Tech Talks: Query Processing in InfluxDB IOxInfluxDB IOx Tech Talks: Query Processing in InfluxDB IOx
InfluxDB IOx Tech Talks: Query Processing in InfluxDB IOxInfluxData
 
Receive side scaling (RSS) with eBPF in QEMU and virtio-net
Receive side scaling (RSS) with eBPF in QEMU and virtio-netReceive side scaling (RSS) with eBPF in QEMU and virtio-net
Receive side scaling (RSS) with eBPF in QEMU and virtio-netYan Vugenfirer
 
Advanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & moreAdvanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & moreLukas Fittl
 
Vulkan introduction
Vulkan introductionVulkan introduction
Vulkan introductionJiahan Su
 
Zen and the Art of Streaming Joins - The What, When and Why (Nick Dearden, Co...
Zen and the Art of Streaming Joins - The What, When and Why (Nick Dearden, Co...Zen and the Art of Streaming Joins - The What, When and Why (Nick Dearden, Co...
Zen and the Art of Streaming Joins - The What, When and Why (Nick Dearden, Co...confluent
 
Multi-Region Cassandra Clusters
Multi-Region Cassandra ClustersMulti-Region Cassandra Clusters
Multi-Region Cassandra ClustersInstaclustr
 
카프카, 산전수전 노하우
카프카, 산전수전 노하우카프카, 산전수전 노하우
카프카, 산전수전 노하우if kakao
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsSpark Summit
 
Faster Container Image Distribution on a Variety of Tools with Lazy Pulling
Faster Container Image Distribution on a Variety of Tools with Lazy PullingFaster Container Image Distribution on a Variety of Tools with Lazy Pulling
Faster Container Image Distribution on a Variety of Tools with Lazy PullingKohei Tokunaga
 
Lightweight Transactions in Scylla versus Apache Cassandra
Lightweight Transactions in Scylla versus Apache CassandraLightweight Transactions in Scylla versus Apache Cassandra
Lightweight Transactions in Scylla versus Apache CassandraScyllaDB
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowDataWorks Summit
 
TiDB as an HTAP Database
TiDB as an HTAP DatabaseTiDB as an HTAP Database
TiDB as an HTAP DatabasePingCAP
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBrendan Gregg
 

What's hot (20)

Linux Networking Explained
Linux Networking ExplainedLinux Networking Explained
Linux Networking Explained
 
Client Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right WayClient Drivers and Cassandra, the Right Way
Client Drivers and Cassandra, the Right Way
 
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
Deletes Without Tombstones or TTLs (Eric Stevens, ProtectWise) | Cassandra Su...
 
Cilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDPCilium - Container Networking with BPF & XDP
Cilium - Container Networking with BPF & XDP
 
Linux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF Superpowers
 
Linux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
 
Linux Performance Tools
Linux Performance ToolsLinux Performance Tools
Linux Performance Tools
 
InfluxDB IOx Tech Talks: Query Processing in InfluxDB IOx
InfluxDB IOx Tech Talks: Query Processing in InfluxDB IOxInfluxDB IOx Tech Talks: Query Processing in InfluxDB IOx
InfluxDB IOx Tech Talks: Query Processing in InfluxDB IOx
 
Receive side scaling (RSS) with eBPF in QEMU and virtio-net
Receive side scaling (RSS) with eBPF in QEMU and virtio-netReceive side scaling (RSS) with eBPF in QEMU and virtio-net
Receive side scaling (RSS) with eBPF in QEMU and virtio-net
 
Advanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & moreAdvanced pg_stat_statements: Filtering, Regression Testing & more
Advanced pg_stat_statements: Filtering, Regression Testing & more
 
Vulkan introduction
Vulkan introductionVulkan introduction
Vulkan introduction
 
Zen and the Art of Streaming Joins - The What, When and Why (Nick Dearden, Co...
Zen and the Art of Streaming Joins - The What, When and Why (Nick Dearden, Co...Zen and the Art of Streaming Joins - The What, When and Why (Nick Dearden, Co...
Zen and the Art of Streaming Joins - The What, When and Why (Nick Dearden, Co...
 
Multi-Region Cassandra Clusters
Multi-Region Cassandra ClustersMulti-Region Cassandra Clusters
Multi-Region Cassandra Clusters
 
카프카, 산전수전 노하우
카프카, 산전수전 노하우카프카, 산전수전 노하우
카프카, 산전수전 노하우
 
Top 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark ApplicationsTop 5 Mistakes When Writing Spark Applications
Top 5 Mistakes When Writing Spark Applications
 
Faster Container Image Distribution on a Variety of Tools with Lazy Pulling
Faster Container Image Distribution on a Variety of Tools with Lazy PullingFaster Container Image Distribution on a Variety of Tools with Lazy Pulling
Faster Container Image Distribution on a Variety of Tools with Lazy Pulling
 
Lightweight Transactions in Scylla versus Apache Cassandra
Lightweight Transactions in Scylla versus Apache CassandraLightweight Transactions in Scylla versus Apache Cassandra
Lightweight Transactions in Scylla versus Apache Cassandra
 
The columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache ArrowThe columnar roadmap: Apache Parquet and Apache Arrow
The columnar roadmap: Apache Parquet and Apache Arrow
 
TiDB as an HTAP Database
TiDB as an HTAP DatabaseTiDB as an HTAP Database
TiDB as an HTAP Database
 
Blazing Performance with Flame Graphs
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
 

Viewers also liked

strangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patternsstrangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patternsMatthew Dennis
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-PatternsMatthew Dennis
 
Cassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestCassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestDuyhai Doan
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentationDuyhai Doan
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandraPatrick McFadin
 
Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"Ontico
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandraAxel Liljencrantz
 
Cassandra rapid prototyping with achilles
Cassandra rapid prototyping with achillesCassandra rapid prototyping with achilles
Cassandra rapid prototyping with achillesDuyhai Doan
 
Cassandra java libraries
Cassandra java librariesCassandra java libraries
Cassandra java librariesDuyhai Doan
 
Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Apekshit Sharma
 
Achilles presentation
Achilles presentationAchilles presentation
Achilles presentationDuyhai Doan
 
Cassandra Drivers and Tools
Cassandra Drivers and ToolsCassandra Drivers and Tools
Cassandra Drivers and ToolsDuyhai Doan
 
Cassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingCassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingMatthew Dennis
 
Effective cassandra development with achilles
Effective cassandra development with achillesEffective cassandra development with achilles
Effective cassandra development with achillesDuyhai Doan
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisDuyhai Doan
 
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...NoSQLmatters
 
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...DataStax
 
DZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling WebinarDZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling WebinarMatthew Dennis
 

Viewers also liked (20)

strangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patternsstrangeloop 2012 apache cassandra anti patterns
strangeloop 2012 apache cassandra anti patterns
 
Cassandra Anti-Patterns
Cassandra Anti-PatternsCassandra Anti-Patterns
Cassandra Anti-Patterns
 
Cassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapestCassandra introduction apache con 2014 budapest
Cassandra introduction apache con 2014 budapest
 
Datastax enterprise presentation
Datastax enterprise presentationDatastax enterprise presentation
Datastax enterprise presentation
 
Advanced data modeling with apache cassandra
Advanced data modeling with apache cassandraAdvanced data modeling with apache cassandra
Advanced data modeling with apache cassandra
 
Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"Денис Нелюбин, "Тамтэк"
Денис Нелюбин, "Тамтэк"
 
Introduction to HBase
Introduction to HBaseIntroduction to HBase
Introduction to HBase
 
Cassandra summit 2013 how not to use cassandra
Cassandra summit 2013  how not to use cassandraCassandra summit 2013  how not to use cassandra
Cassandra summit 2013 how not to use cassandra
 
Cassandra rapid prototyping with achilles
Cassandra rapid prototyping with achillesCassandra rapid prototyping with achilles
Cassandra rapid prototyping with achilles
 
Cassandra java libraries
Cassandra java librariesCassandra java libraries
Cassandra java libraries
 
Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015Introduction to HBase - NoSqlNow2015
Introduction to HBase - NoSqlNow2015
 
Achilles presentation
Achilles presentationAchilles presentation
Achilles presentation
 
Cassandra Drivers and Tools
Cassandra Drivers and ToolsCassandra Drivers and Tools
Cassandra Drivers and Tools
 
Cassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data ModelingCassandra NYC 2011 Data Modeling
Cassandra NYC 2011 Data Modeling
 
Effective cassandra development with achilles
Effective cassandra development with achillesEffective cassandra development with achilles
Effective cassandra development with achilles
 
Cassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS ParisCassandra NodeJS driver & NodeJS Paris
Cassandra NodeJS driver & NodeJS Paris
 
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - ...
 
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
From Monolith to Microservices with Cassandra, Grpc, and Falcor (Luke Tillman...
 
DZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling WebinarDZone Cassandra Data Modeling Webinar
DZone Cassandra Data Modeling Webinar
 
Apache Cassandra and Go
Apache Cassandra and GoApache Cassandra and Go
Apache Cassandra and Go
 

Similar to Cassandra nice use cases and worst anti patterns

Cassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelonaCassandra nice use cases and worst anti patterns no sql-matters barcelona
Cassandra nice use cases and worst anti patterns no sql-matters barcelonaDuyhai Doan
 
Introduction to Cassandra & Data model
Introduction to Cassandra & Data modelIntroduction to Cassandra & Data model
Introduction to Cassandra & Data modelDuyhai Doan
 
Cassandra introduction 2016
Cassandra introduction 2016Cassandra introduction 2016
Cassandra introduction 2016Duyhai Doan
 
Cassandra for the ops dos and donts
Cassandra for the ops   dos and dontsCassandra for the ops   dos and donts
Cassandra for the ops dos and dontsDuyhai Doan
 
KillrChat presentation
KillrChat presentationKillrChat presentation
KillrChat presentationDuyhai Doan
 
Cassandra introduction at FinishJUG
Cassandra introduction at FinishJUGCassandra introduction at FinishJUG
Cassandra introduction at FinishJUGDuyhai Doan
 
Cassandra introduction @ NantesJUG
Cassandra introduction @ NantesJUGCassandra introduction @ NantesJUG
Cassandra introduction @ NantesJUGDuyhai Doan
 
Cassandra drivers and libraries
Cassandra drivers and librariesCassandra drivers and libraries
Cassandra drivers and librariesDuyhai Doan
 
Cassandra introduction mars jug
Cassandra introduction mars jugCassandra introduction mars jug
Cassandra introduction mars jugDuyhai Doan
 
Cassandra data structures and algorithms
Cassandra data structures and algorithmsCassandra data structures and algorithms
Cassandra data structures and algorithmsDuyhai Doan
 
Libon cassandra summiteu2014
Libon cassandra summiteu2014Libon cassandra summiteu2014
Libon cassandra summiteu2014Duyhai Doan
 
Cassandra introduction @ ParisJUG
Cassandra introduction @ ParisJUGCassandra introduction @ ParisJUG
Cassandra introduction @ ParisJUGDuyhai Doan
 
KillrChat: Building Your First Application in Apache Cassandra (English)
KillrChat: Building Your First Application in Apache Cassandra (English)KillrChat: Building Your First Application in Apache Cassandra (English)
KillrChat: Building Your First Application in Apache Cassandra (English)DataStax Academy
 
KillrChat Data Modeling
KillrChat Data ModelingKillrChat Data Modeling
KillrChat Data ModelingDuyhai Doan
 
Understanding hd wallets design and implementation
Understanding hd wallets  design and implementationUnderstanding hd wallets  design and implementation
Understanding hd wallets design and implementationArcBlock
 
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016Sasi, cassandra on the full text search ride At  Voxxed Day Belgrade 2016
Sasi, cassandra on the full text search ride At Voxxed Day Belgrade 2016Duyhai Doan
 
Real data models of silicon valley
Real data models of silicon valleyReal data models of silicon valley
Cassandra nice use cases and worst anti patterns

  • 1. Cassandra nice use-cases and worst anti-patterns DuyHai DOAN, Technical Advocate @doanduyhai
  • 2. Agenda! Anti-patterns • Queue-like designs • CQL null values • Intensive update on same column • Design around dynamic schema
  • 3. Agenda! Nice use-cases • Rate-limiting • Anti Fraud • Account validation • Sensor data timeseries
  • 4. Worst anti-patterns! Queue-like designs! CQL null! Intensive update on same column! Design around dynamic schema!
  • 5. Failure level! ☠ ☠☠ ☠☠☠ ☠☠☠☠
  • 6. Queue-like designs! Adding new message ☞ 1 physical insert
  • 7. Queue-like designs! Adding new message ☞ 1 physical insert Consuming message = deleting it ☞ 1 physical insert (tombstone)
  • 8. Queue-like designs! Adding new message ☞ 1 physical insert Consuming message = deleting it ☞ 1 physical insert (tombstone) Transactional queue = re-inserting messages ☞ physical insert * <many>
  • 9. Queue-like designs! FIFO queue A { A }
  • 10. Queue-like designs! FIFO queue A B { A, B }
  • 11. Queue-like designs! FIFO queue A B C { A, B, C }
  • 12. Queue-like designs! FIFO queue A B C A { B, C }
  • 13. Queue-like designs! FIFO queue A B C A D { B, C, D }
  • 14. Queue-like designs! FIFO queue A B C A D B { C, D }
  • 15. Queue-like designs! FIFO queue A B C A D B C { D }
  • 16. Queue-like designs! FIFO queue, worst case A A A A A A A A A A { }
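The FIFO slides can be read as a log of physical writes, where a repeated letter stands for the tombstone written when that message is consumed (this reading is an interpretation of the flattened slides). A toy Java model, with the hypothetical class `QueueOnLog` and no compaction or tombstone expiry, shows how the physical log keeps growing while the set of live messages stays small:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy model only: it ignores compaction and tombstone expiry. */
final class QueueOnLog {
    // Every operation appends one physical entry, as on the slides.
    private final List<String> physicalWrites = new ArrayList<>();
    // Messages that a consumer would still see.
    private final Set<String> live = new LinkedHashSet<>();

    void enqueue(String msg) {
        physicalWrites.add("insert:" + msg);
        live.add(msg);
    }

    void consume(String msg) {
        // A delete is itself a write: it appends a tombstone.
        physicalWrites.add("tombstone:" + msg);
        live.remove(msg);
    }

    int physicalCount() {
        return physicalWrites.size();
    }

    Set<String> liveMessages() {
        return live;
    }
}
```

Replaying the sequence of slides 9 to 15 (enqueue A, B, C, consume A, enqueue D, consume B, consume C) leaves one live message behind seven physical entries, all of which a read of that partition may have to scan.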
  • 18. CQL null semantics! Reading null value means • value does not exist (has never been created) • value deleted (tombstone) SELECT age FROM users WHERE login = 'ddoan'; → NULL
  • 19. CQL null semantics! Writing null means • delete value (creating tombstone) • even though it does not exist UPDATE users SET age = NULL WHERE login = 'ddoan';
  • 20. CQL null semantics! Seen in production: prepared statement UPDATE users SET age = ?, … geo_location = ?, mood = ?, … WHERE login = ?;
  • 21. CQL null semantics! Seen in production: bound statement preparedStatement.bind(33, …, null, null, null, …); null ☞ tombstone creation on each update … jdoe ☞ age = 33, name = John DOE, geo_loc = ✗, mood = ✗, status = ✗ (✗ = tombstone)
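One possible way around the null-binding problem is to build the UPDATE from the non-null columns only, so that no null is ever bound. The sketch below is plain Java with no driver dependency; `NonNullUpdate` and `forUser` are hypothetical names, and the `users` table is the one from the slides. Column names are assumed to come from trusted code, not from user input.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of any Cassandra driver. */
final class NonNullUpdate {
    final String cql;          // statement text with '?' placeholders
    final List<Object> values; // values to bind, in placeholder order

    private NonNullUpdate(String cql, List<Object> values) {
        this.cql = cql;
        this.values = values;
    }

    /** Builds an UPDATE on users that skips every column whose value is null. */
    static NonNullUpdate forUser(String login, Map<String, Object> columns) {
        List<String> assignments = new ArrayList<>();
        List<Object> values = new ArrayList<>();
        for (Map.Entry<String, Object> e : columns.entrySet()) {
            if (e.getValue() != null) {
                assignments.add(e.getKey() + " = ?");
                values.add(e.getValue());
            }
        }
        if (assignments.isEmpty()) {
            throw new IllegalArgumentException("nothing to update");
        }
        values.add(login); // last placeholder is the WHERE clause
        String cql = "UPDATE users SET " + String.join(", ", assignments)
                + " WHERE login = ?";
        return new NonNullUpdate(cql, values);
    }
}
```

The trade-off is that each distinct combination of columns yields a different statement text to prepare. Newer Cassandra versions and drivers may also allow leaving a bound value unset instead of binding null; check the documentation of the driver in use.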
  • 23. Intensive update! Context • small start-up • cloud-based video recording & alarm • internet of things (sensor) • 10 updates/sec for some sensors
  • 24. Intensive update on same column! Data model: sensor_id ☞ value (45.0034) CREATE TABLE sensor_data ( sensor_id bigint, value double, PRIMARY KEY(sensor_id));
  • 25. Intensive update on same column! Updates UPDATE sensor_data SET value = 45.0034 WHERE sensor_id = …; UPDATE sensor_data SET value = 47.4182 WHERE sensor_id = …; UPDATE sensor_data SET value = 48.0300 WHERE sensor_id = …; Physical versions for sensor_id: value (t1) = 45.0034, value (t13) = 47.4182, value (t36) = 48.0300
  • 26. Intensive update on same column! Read SELECT value FROM sensor_data WHERE sensor_id = …; read N physical columns, only 1 useful … Physical versions for sensor_id: value (t1) = 45.0034, value (t13) = 47.4182, value (t36) = 48.0300
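The cost of that read can be pictured as a last-write-wins merge: every physical version of the column has to be visited, and only the one with the highest write timestamp is returned. The following is a minimal sketch in plain Java, with hypothetical `Cell` and `ReadMerge` types that stand in for Cassandra internals rather than reproduce them:

```java
import java.util.List;

/** One physical version of a column: a write timestamp plus a value. */
final class Cell {
    final long writeTime;
    final double value;

    Cell(long writeTime, double value) {
        this.writeTime = writeTime;
        this.value = value;
    }
}

final class ReadMerge {
    /** Visits all N versions and keeps the most recent one. */
    static Cell latest(List<Cell> versions) {
        if (versions.isEmpty()) {
            throw new IllegalArgumentException("no versions");
        }
        Cell newest = versions.get(0);
        for (Cell c : versions) {
            if (c.writeTime > newest.writeTime) {
                newest = c;
            }
        }
        return newest;
    }
}
```

With the three versions from the slide (t1, t13, t36) the merge returns the value written at t36, after touching all three.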
  • 27. Intensive update on same column! Solution 1: leveled compaction! (if your I/O can keep up) value (t1) = 45.0034, value (t13) = 47.4182, value (t36) = 48.0300, compacted to value (t36) = 48.0300
  • 28. Intensive update on same column! Solution 2: reversed timeseries & DateTiered compaction strategy CREATE TABLE sensor_data ( sensor_id bigint, date timestamp, sensor_value double, PRIMARY KEY((sensor_id), date)) WITH CLUSTERING ORDER BY (date DESC);
  • 29. Intensive update on same column! SELECT sensor_value FROM sensor_data WHERE sensor_id = … LIMIT 1; sensor_id ☞ date3 (t3) = 48.0300, date2 (t2) = 47.4182, date1 (t1) = 45.0034, … Data cleaning by configuration (max_sstable_age_days)
  • 31. Design around dynamic schema! Customer emergency call • 3 nodes cluster almost full • impossible to scale out • 4th node in JOINING state for 1 week • disk space is filling up, production at risk!
  • 32. Design around dynamic schema! After investigation • 4th node in JOINING state because streaming is stalled • NPE in logs
  • 33. Design around dynamic schema! After investigation • 4th node in JOINING state because streaming is stalled • NPE in logs Cassandra source-code to the rescue
  • 34. Design around dynamic schema! public class CompressedStreamReader extends StreamReader { … @Override public SSTableWriter read(ReadableByteChannel channel) throws IOException { … Pair<String, String> kscf = Schema.instance.getCF(cfId); ColumnFamilyStore cfs = Keyspace.open(kscf.left).getColumnFamilyStore(kscf.right); // NPE here
  • 35. Design around dynamic schema! The truth is • the devs dynamically drop & recreate table every day • dynamic schema is at the core of their design Example: DROP TABLE catalog_127_20140613; CREATE TABLE catalog_127_20140614( … );
  • 36. Design around dynamic schema! Failure sequence (diagram): nodes n1, n2, n3, n4; table catalog_x_y on each node
  • 37. Design around dynamic schema! Failure sequence (diagram): nodes n1, n2, n3, n4; tables catalog_x_y and catalog_x_z on each node
  • 38. Design around dynamic schema! Failure sequence (diagram): nodes n1, n2, n3, n4; table catalog_x_z on each node; catalog_x_y ????
  • 39. Design around dynamic schema! Consequences • joining node always got stuck • → cannot extend cluster • → changing code takes time • → production in danger (no space left) • → sacrifice analytics data to survive
  • 40. Design around dynamic schema! Nutshell • dynamic schema changes as part of normal operations are not recommended • concurrent schema AND topology change is an anti-pattern
  • 41. Failure level! ☠☠☠☠
  • 42. Q & A
  • 43. Nice Examples! Rate limiting! Anti Fraud! Account Validation! Sensor Data Timeseries!
  • 44. Rate limiting! Start-up company, reset password feature 1) /password/reset 2) SMS with token A0F83E63DB935465CE73DFE…. Phone number Random token 3) /password/new/<token>/<password>
  • 45. Rate limiting! Problem 1 • account created with premium phone number
  • 46. Rate limiting! Problem 1 • account created with premium phone number • /password/reset x 100
  • 47. Rate limiting! « money, money, money, give money, in the richman’s world » $$$
  • 48. Rate limiting! Problem 2 • massive hack
  • 49. Rate limiting! Problem 2 • massive hack • 10^6 /password/reset calls from a few accounts
  • 50. Rate limiting! Problem 2 • massive hack • 10^6 /password/reset calls from a few accounts • SMS messages are cheap
  • 51. Rate limiting! Problem 2 • ☞ but not at the scale of 10^6 per user per day
  • 52. Rate limiting! Solution • premium phone number ☞ Google libphonenumber
  • 53. Rate limiting! Solution • premium phone number ☞ Google libphonenumber • massive hack ☞ rate limiting with Cassandra
  • 54. Cassandra Time To Live! Time to live • built-in feature • insert data with a TTL in sec • expires server-side automatically • ☞ use as sliding-window
  • 55. Rate limiting in action! Implementation • threshold = max 3 reset password per sliding 24h
  • 56. Rate limiting in action! Implementation • when /password/reset called • check threshold • reached ☞ error message/ignore • not reached ☞ log the attempt with TTL = 86400
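The check-then-log logic of slides 55 and 56 can be sketched in memory. In the Java below, the hypothetical class `ResetRateLimiter` stands in for a Cassandra table of attempts per login: dropping entries whose expiry time has passed plays the role of the server-side TTL, and counting what remains plays the role of the threshold check. The clock is passed in as epoch seconds so the behaviour is easy to follow.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for a table of reset attempts written with a TTL. */
final class ResetRateLimiter {
    static final int MAX_ATTEMPTS = 3;       // threshold from slide 55
    static final long TTL_SECONDS = 86_400L; // 24 h, as on slide 56

    // login -> expiry times (epoch seconds) of logged attempts, oldest first
    private final Map<String, Deque<Long>> attempts = new HashMap<>();

    /** Returns true if the reset may proceed, and logs the attempt if so. */
    boolean tryReset(String login, long nowSeconds) {
        Deque<Long> expiries =
                attempts.computeIfAbsent(login, k -> new ArrayDeque<>());
        // Expired attempts disappear; Cassandra would do this server-side.
        while (!expiries.isEmpty() && expiries.peekFirst() <= nowSeconds) {
            expiries.pollFirst();
        }
        if (expiries.size() >= MAX_ATTEMPTS) {
            return false; // threshold reached: error message or ignore
        }
        expiries.addLast(nowSeconds + TTL_SECONDS);
        return true;
    }
}
```

Three calls within the window succeed, a fourth is refused, and a new call is accepted again once the oldest attempt has aged out. Against a real cluster the check and the insert are two separate requests, so concurrent calls can slightly overshoot the threshold; for this use case that is usually acceptable.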
  • 58. Anti Fraud! Real story • many special offers available • 30 mins international calls (50 countries) • unlimited land-line calls to 5 countries • …
  • 59. Anti Fraud! Real story • each offer has a duration (week/month/year) • only one offer active at a time
  • 60. Anti Fraud! Cassandra TTL • check for existing offer before SELECT count(*) FROM user_special_offer WHERE login = 'jdoe';
  • 61. Anti Fraud! Cassandra TTL • then grant new offer INSERT INTO user_special_offer(login, offer_code, …) VALUES('jdoe', '30_mins_international',…) USING TTL <offer_duration>;
  • 62. Account Validation! Requirement • user creates new account • sends sms/email link with token to validate account • 10 days to validate
  • 63. Account Validation! How to ? • create account with 10 days TTL INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33) USING TTL 864000;
  • 64. Account Validation! How to ? • create random token for validation with 10 days TTL INSERT INTO account_validation(token, login, name, age) VALUES('A0F83E63DB935465CE73DFE…', 'jdoe', 'John DOE', 33) USING TTL 864000;
  • 65. Account Validation! On token validation • check token exist & retrieve user details SELECT login, name, age FROM account_validation WHERE token = 'A0F83E63DB935465CE73DFE…'; • re-insert durably user details without TTL INSERT INTO users(login, name, age) VALUES('jdoe', 'John DOE', 33);
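The TTL values on these slides are plain second counts: 864000 is 10 days, and the 86400 of the rate-limiting example is 24 hours. A small sketch, with a hypothetical helper `Ttl`, derives them from a `java.time.Duration` instead of hard-coding the number:

```java
import java.time.Duration;

/** Hypothetical helper: converts a duration to a TTL in whole seconds. */
final class Ttl {
    static int seconds(Duration d) {
        // toIntExact throws if the value does not fit in an int.
        return Math.toIntExact(d.getSeconds());
    }
}
```

`Ttl.seconds(Duration.ofDays(10))` gives the value used for the account and the validation token, which keeps the 10-day requirement visible in the code.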
  • 66. Sensor Data Timeseries! Requirements • lots of sensors (10^3 – 10^6) • medium to high insertion rate (0.1 – 10/sec) • keep good load balancing • fast read & write
  • 67. Bucketing! CREATE TABLE sensor_data ( sensor_id text, date timestamp, raw_data blob, PRIMARY KEY(sensor_id, date)); sensor_id ☞ date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, …
  • 68. Bucketing! Problems: • limit of 2×10^9 physical columns • bad load balancing (1 sensor = 1 node) • wide row spans over many files sensor_id ☞ date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, …
  • 69. Bucketing! Idea: • composite partition key: sensor_id:date_bucket • tunable date granularity: per hour/per day/per month … CREATE TABLE sensor_data ( sensor_id text, date_bucket int, //format YYYYMMdd date timestamp, raw_data blob, PRIMARY KEY((sensor_id, date_bucket), date));
  • 70. Bucketing! Idea: • composite partition key: sensor_id:date_bucket • tunable date granularity: per hour/per day/per month … Buckets: sensor_id:2014091014 ☞ date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, … sensor_id:2014091015 ☞ date11 = blob11, date12 = blob12, date13 = blob13, date14 = blob14, …
  • 71. Bucketing! Advantage: • distribute load: 1 bucket = 1 node • limit partition width (max x columns per bucket) Buckets: sensor_id:2014091014 ☞ date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, … sensor_id:2014091015 ☞ date11 = blob11, date12 = blob12, date13 = blob13, date14 = blob14, …
  • 72. Bucketing! But how can I select raw data between 14:45 and 15:10 ? 14:45 → ? 15:00 → 15:10 Buckets: sensor_id:2014091014 ☞ date1 = blob1, date2 = blob2, date3 = blob3, date4 = blob4, … sensor_id:2014091015 ☞ date11 = blob11, date12 = blob12, date13 = blob13, date14 = blob14, …
  • 73. Bucketing! Solution • use IN clause on partition key component • with range condition on date column ☞ date column should be monotonic function (increasing/decreasing) SELECT * FROM sensor_data WHERE sensor_id = xxx AND date_bucket IN (2014091014, 2014091015) AND date >= '2014-09-10 14:45:00.000' AND date <= '2014-09-10 15:10:00.000';
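The bucket values for the IN clause can be derived from the requested time range. The sketch below assumes hourly buckets encoded as yyyyMMddHH integers, matching the 2014091014 and 2014091015 examples, and assumes that readers and writers agree on one time zone. `DateBuckets` and `hourlyBuckets` are hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.ArrayList;
import java.util.List;

final class DateBuckets {
    private static final DateTimeFormatter HOURLY =
            DateTimeFormatter.ofPattern("yyyyMMddHH");

    /** Lists every hourly bucket touched by the inclusive range [from, to]. */
    static List<Integer> hourlyBuckets(LocalDateTime from, LocalDateTime to) {
        if (to.isBefore(from)) {
            throw new IllegalArgumentException("'to' is before 'from'");
        }
        List<Integer> buckets = new ArrayList<>();
        // Start at the beginning of the hour that contains 'from'.
        LocalDateTime hour = from.truncatedTo(ChronoUnit.HOURS);
        while (!hour.isAfter(to)) {
            buckets.add(Integer.parseInt(hour.format(HOURLY)));
            hour = hour.plusHours(1);
        }
        return buckets;
    }
}
```

For the range 14:45 to 15:10 on 2014-09-10 this yields the two buckets used in the query above. A wide range yields many buckets, so the caller should cap the list size before building an IN clause from it.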
  • 74. Bucketing Caveats! IN clause on partitions is not a silver bullet ! • use sparingly • keep cardinality low (≤ 5) Diagram: nodes n1 to n8, one coordinator fetching sensor_id:2014091014 and sensor_id:2014091015
  • 75. Bucketing Caveats! IN clause on partitions is not a silver bullet ! • use sparingly • keep cardinality low (≤ 5) • prefer parallel async queries • ease of query vs perf Diagram: nodes n1 to n8, async client querying sensor_id:2014091014 and sensor_id:2014091015 directly
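The parallel async alternative amounts to issuing one query per bucket and merging the results on the client. A minimal sketch with `CompletableFuture` follows; `BucketFanOut` is a hypothetical name, and the `queryBucket` function is a placeholder for whatever asynchronous call the driver in use provides.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class BucketFanOut {
    /**
     * Starts one asynchronous query per bucket, then gathers the rows
     * in bucket order. 'queryBucket' is supplied by the caller.
     */
    static <R> List<R> queryAll(
            List<Integer> buckets,
            Function<Integer, CompletableFuture<List<R>>> queryBucket) {
        List<CompletableFuture<List<R>>> pending = new ArrayList<>();
        for (Integer bucket : buckets) {
            pending.add(queryBucket.apply(bucket)); // all queries start here
        }
        List<R> rows = new ArrayList<>();
        for (CompletableFuture<List<R>> f : pending) {
            rows.addAll(f.join()); // wait for each result in turn
        }
        return rows;
    }
}
```

Each query targets a single partition, so no single coordinator has to gather every bucket on behalf of the client. The price is more client code than a single IN query, which is the ease of query versus performance trade-off named on the slide.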
  • 76. Q & A
  • 77. Cassandra developers! Rule n°1 If you don’t know, ask for help (me, Cassandra ML, PlanetCassandra, stackoverflow, …) !
  • 78. Cassandra developers! Rule n°2 Do not troubleshoot by blind guessing, alone, in production (ask for help, see rule n°1) !
  • 79. Cassandra developers! Rule n°3 Share with the community (your best use-cases … and worst failures) ! http://planetcassandra.org/
  • 80. Thank You @doanduyhai duy_hai.doan@datastax.com