Scylla Summit 2017: SMF: The Fastest RPC in the West

smf
the fastest RPC framework in the West
Alexander Gallego
Principal Engineer, Platform Engineering - Akamai Technologies

Alexander Gallego
▪ Principal Engineer @ Akamai - Platform Group
▪ Ex-CTO / founder of Concord.io - a distributed stream processor written in C++ atop Apache Mesos (now part of Akamai)
▪ First employee, engineer for Yieldmo.com (ad-tech startup in NYC)
▪ Maintainer of smf: github.com/senior7515/smf

background

Can we do transactional streaming?
▪ At Concord.io, I worked on a streaming platform
o Can we do transactional writes (3x replication, even if in memory)?
• Can we do it with low latency and high throughput?
– Double-digit ms *tail* latency at 1024 batches?
▪ Fastest open source queue did 150ms p90 and 2secs p9999
o Unpredictable JVM spikes:
• Spark once stalled for 4 seconds reading from Kafka - couldn’t replicate.
o Concord was in C++ - we wanted predictability

Can we do better?

How about this! - per node overhead
Figure: per-node latency. p90 = 7us, p99 = 8us, p100 = 26us

Per-node request path: Read Socket → RPC Parsing → Method Execution → Flush Socket
60 byte payload + 20 bytes of TCP - with full type serialization!
p90 = 7 microseconds, p99 = 8 microseconds, p100 = 26 microseconds

smf

smf RPC
▪ Built for microsecond tail latencies
▪ Atop seastar::future<>s
▪ IDL Compiler/CodeGen - using Google Flatbuffers’ IDL
▪ Multi-language compatibility
▪ Small - 16 byte overhead (with rich types, headers, compression, etc.)
… it’s like gRPC / Thrift / Cap’n Proto - for microsecond latencies.

IDL

namespace smf_gen.demo;
// input
table Request {
  name: string;
}
// output
table Response {
  name: string;
}
// Service Definition
rpc_service SmfStorage {
  Get(Request):Response;
}

Code generation:
smf_gen --filename demo_service.fbs

Demo

smf::rpc_client

// data to send
smf::rpc_typed_envelope<Request> req;
req.data->name = "Hello, smf-world!";
// actual socket: a seastar::shared_ptr<T>, non-thread safe
auto client = SmfStorageClient::make_shared("127.0.0.1", 2121);
// method to call
client->Get(req.serialize_data()).then([](auto reply) {
  std::cout << reply->name() << std::endl;
});

smf::rpc_server

// SmfStorage is the code-gen’ed service
class storage_service : public SmfStorage {
  // method
  virtual seastar::future<rpc_typed_envelope<Response>>
  Get(rpc_recv_typed_context<Request> rec) final {
    rpc_typed_envelope<Response> data;
    data.data->name = "Hello, cruel world!";
    data.envelope.set_status(200);
    // return data
    return make_ready_future<decltype(data)>(std::move(data));
  }
};

smf::rpc_filter

template <typename T>
struct rpc_filter {
  seastar::future<T> operator()(T t);
};

struct zstd_compression_filter : rpc_filter<rpc_envelope> {
  explicit zstd_compression_filter(uint32_t min_size)
    : min_compression_size(min_size) {}
  seastar::future<rpc_envelope> operator()(rpc_envelope &&e);
  const uint32_t min_compression_size;
};

// add it to your clients
client->outgoing_filters().push_back(
  smf::zstd_compression_filter(1000));
// add it to your servers
using zstd_t = smf::zstd_compression_filter;
return rpc.invoke_on_all(
  &smf::rpc_server::register_outgoing_filter<zstd_t>, 1000);
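
The filter slides describe a chain of callables that each take an envelope and hand back a (possibly modified) envelope. A minimal synchronous sketch of that idea follows. It does not use seastar or zstd: `envelope`, `filter_fn`, `mark_compressed_filter` and `apply_filters` are hypothetical names for illustration, the filter only sets a flag instead of compressing, and treating `min_size` as an inclusive threshold is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for smf::rpc_envelope; not the real type.
struct envelope {
  std::string body;
  bool compressed = false;
};

// A filter takes an envelope by value and returns the filtered envelope.
using filter_fn = std::function<envelope(envelope)>;

// Mirrors the min_size idea from the slide, but only marks the payload;
// a real filter would call into a compression library here.
struct mark_compressed_filter {
  explicit mark_compressed_filter(uint32_t min_size)
    : min_compression_size(min_size) {}
  envelope operator()(envelope e) const {
    // Assumption: payloads at or above the threshold get compressed.
    if (e.body.size() >= min_compression_size) {
      e.compressed = true;
    }
    return e;
  }
  const uint32_t min_compression_size;
};

// Apply every registered filter in registration order.
inline envelope apply_filters(const std::vector<filter_fn> &filters,
                              envelope e) {
  for (const auto &f : filters) {
    e = f(std::move(e));
  }
  return e;
}
```

With a threshold of 1000, a 60 byte payload passes through untouched while a 2000 byte payload gets flagged, which is the reason to carry a minimum size at all: small messages are not worth the compression cost.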

// SEDA-pipelined
static thread_local auto incoming_stage =
  seastar::make_execution_stage("smf::incoming",
                                &rpc_client::apply_incoming_filters);
static thread_local auto outgoing_stage =
  seastar::make_execution_stage("smf::outgoing",
                                &rpc_client::apply_outgoing_filters);

request anatomy

smf request anatomy
/// total = 128bits == 16bytes
/// MANUALLY_ALIGNED_STRUCT:
/// - Turn off padding by compiler.
/// - Enforce layout.
/// - Store everything in little endian
///   - X-lang, X-platform compat
///   - noop on most platforms
MANUALLY_ALIGNED_STRUCT(4) header FLATBUFFERS_FINAL_CLASS {
  int8_t compression_;   // zstd, lz4
  int8_t bitflags_;      // headers?
  uint16_t session_;     // max # of concurrent requests per client
  uint32_t size_;
  uint32_t checksum_;    // xxhash32 - very fast! 5.4GB/s
  uint32_t meta_;        // request_id or status (response) code
};
STRUCT_END(header, 16);
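
To make the 16 byte, little-endian layout concrete, here is a small sketch that packs the same six fields into a byte buffer by hand. It is not the FlatBuffers-generated struct: `wire_header`, `put_le`, `get_le`, `encode` and `decode` are hypothetical names, and the sketch only assumes the field order and widths shown on the slide.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical plain struct with the slide's field order and widths.
struct wire_header {
  int8_t compression;
  int8_t bitflags;
  uint16_t session;
  uint32_t size;
  uint32_t checksum;
  uint32_t meta;
};
static_assert(sizeof(wire_header) == 16, "header must stay 16 bytes");

// Write an unsigned integer least-significant byte first.
template <typename T>
inline void put_le(uint8_t *out, T v) {
  for (std::size_t i = 0; i < sizeof(T); ++i) {
    out[i] = static_cast<uint8_t>(v >> (8 * i));
  }
}

// Read an unsigned integer stored least-significant byte first.
template <typename T>
inline T get_le(const uint8_t *in) {
  T v = 0;
  for (std::size_t i = 0; i < sizeof(T); ++i) {
    v = static_cast<T>(v | (static_cast<T>(in[i]) << (8 * i)));
  }
  return v;
}

// Serialize field by field so the result is the same on any host.
inline std::array<uint8_t, 16> encode(const wire_header &h) {
  std::array<uint8_t, 16> buf{};
  buf[0] = static_cast<uint8_t>(h.compression);
  buf[1] = static_cast<uint8_t>(h.bitflags);
  put_le<uint16_t>(&buf[2], h.session);
  put_le<uint32_t>(&buf[4], h.size);
  put_le<uint32_t>(&buf[8], h.checksum);
  put_le<uint32_t>(&buf[12], h.meta);
  return buf;
}

inline wire_header decode(const std::array<uint8_t, 16> &buf) {
  wire_header h{};
  h.compression = static_cast<int8_t>(buf[0]);
  h.bitflags = static_cast<int8_t>(buf[1]);
  h.session = get_le<uint16_t>(&buf[2]);
  h.size = get_le<uint32_t>(&buf[4]);
  h.checksum = get_le<uint32_t>(&buf[8]);
  h.meta = get_le<uint32_t>(&buf[12]);
  return h;
}
```

On a little-endian host the byte-wise loops produce the same bytes as a straight memory copy, which is the sense in which the slide calls the conversion a no-op on most platforms.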

smf Code Gen’d XOR id
auto fqn = fully_qualified_name;
service_id = hash( fqn(service_name) )
method_id  = hash( ∀ fqn(x) input_args_types,
                   ∀ fqn(x) output_args_types,
                   fqn(method_name),
                   separator = ":" )
rpc_dispatch_id = service_id ^ method_id;
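
The pseudocode above can be sketched in plain C++. The slide does not say which hash the code generator uses, so this sketch swaps in 32-bit FNV-1a purely for illustration; the exact string that gets hashed (the order of the parts and the placement of the ":" separator) is also an assumption, as are the function names.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// 32-bit FNV-1a: a stand-in hash for this sketch, not necessarily smf's.
inline uint32_t fnv1a32(const std::string &s) {
  uint32_t h = 2166136261u;
  for (unsigned char c : s) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// service_id = hash(fully qualified service name)
inline uint32_t service_id(const std::string &fq_service) {
  return fnv1a32(fq_service);
}

// method_id = hash(input types, output types, method name, joined by ':')
// Assumption: parts are concatenated in this order.
inline uint32_t method_id(const std::vector<std::string> &input_types,
                          const std::vector<std::string> &output_types,
                          const std::string &fq_method) {
  std::string key;
  for (const auto &t : input_types) {
    key += t;
    key += ':';
  }
  for (const auto &t : output_types) {
    key += t;
    key += ':';
  }
  key += fq_method;
  return fnv1a32(key);
}

// rpc_dispatch_id = service_id ^ method_id
inline uint32_t rpc_dispatch_id(uint32_t sid, uint32_t mid) {
  return sid ^ mid;
}
```

Because XOR is its own inverse, XOR-ing the dispatch id with the service id gives back the method id. Hashing the argument types into the method id also means that changing a method's signature changes its id.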

/// RequestID: 212494116 ^ 1719559449
/// ServiceID: 212494116
/// MethodID: 1719559449
future<smf::rpc_recv_typed_context<Response>>
Get(smf::rpc_envelope e) {
  e.set_request_id(212494116, 1719559449);  // service ID, method ID
  return send<smf_gen::demo::Response>(std::move(e));
}

handles.emplace_back(
  "Get", 1719559449,  // method ID
  [this](smf::rpc_recv_context c) {
    using req_t = smf::rpc_recv_typed_context<Request>;
    auto session_id = c.session();
    return Get(req_t(std::move(c))).then(
      [session_id](auto typed_env) {
        typed_env....mutate_session(session_id);
        return make_ready_future<rpc_envelope>(
          typed_env.serialize_data());
      });
  });

struct rpc_service {
  virtual const char *service_name() const = 0;
  virtual uint32_t service_id() const = 0;
  virtual std::vector<rpc_service_method_handle> methods() = 0;
  virtual ~rpc_service() {}
};
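
One way a server could use these ids is a flat lookup table keyed by `service_id ^ method_id`. The sketch below assumes that routing scheme; the slides do not show how smf actually routes a request, and `router_sketch`, `handler_fn`, `register_method` and `dispatch` are hypothetical names. Handlers are plain synchronous functions on strings rather than seastar futures on envelopes.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical handler type: payload in, reply out.
using handler_fn = std::function<std::string(const std::string &)>;

class router_sketch {
 public:
  // Store the handler under the XOR of the two generated ids.
  void register_method(uint32_t service_id, uint32_t method_id,
                       handler_fn fn) {
    handlers_[service_id ^ method_id] = std::move(fn);
  }

  // Look up the handler for request_id; returns false when none exists.
  bool dispatch(uint32_t request_id, const std::string &payload,
                std::string *reply) const {
    auto it = handlers_.find(request_id);
    if (it == handlers_.end()) {
      return false;
    }
    *reply = it->second(payload);
    return true;
  }

 private:
  std::unordered_map<uint32_t, handler_fn> handlers_;
};
```

A single 32-bit key keeps the lookup to one hash-table probe per request; the trade-off is that two different service/method pairs could in principle collide on the same XOR value.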

telemetry

smf built in telemetry
High Dynamic Range Histogram (HDR): expensive, 185 KB
::hdr_init(1,                    // 1 microsec - minimum value
           INT64_C(3600000000),  // 1 hour in microsecs - max value
           3,                    // Number of significant figures
           &hist);               // Pointer to initialize
// clients
client = ClientService::make_shared(std::move(opts));
client->enable_histogram_metrics();
// servers enabled by default
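
For readers unfamiliar with the p90 / p99 / p100 figures quoted throughout the talk, here is a small sketch of how a percentile can be read off a set of recorded latencies. It uses the nearest-rank method over a sorted copy of the samples, which is a deliberately simple substitute and not how an HDR histogram works internally; the function name `percentile` is hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Nearest-rank percentile over latency samples (e.g. microseconds).
// p must be in (0, 100]; p = 100 returns the maximum sample.
inline int64_t percentile(std::vector<int64_t> samples, double p) {
  assert(!samples.empty() && p > 0.0 && p <= 100.0);
  std::sort(samples.begin(), samples.end());
  // Rank is 1-based: the smallest rank covering p percent of the samples.
  const auto rank =
    static_cast<std::size_t>(std::ceil(p * samples.size() / 100.0));
  return samples[rank - 1];
}
```

Sorting every sample costs memory and time proportional to the number of requests. A fixed-size histogram such as the one configured above avoids that by counting into pre-allocated buckets, at the price of the up-front footprint the slide mentions.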

smf built in telemetry (prometheus)

performance?

smf DPDK client - DPDK server*
Figure: latency. p90 = 7us, p99 = 8us, p100 = 26us

smf end-to-end latency (DPDK)
Figure: end-to-end latency. p90 = 51us, p99 = 56us, p100 = 500us
2 threads. Includes connection open time - cold cache

Same graph, minus the **first** request of each of the 2 threads:
p50 = 51us, p99 = 56us, p100 = 151us

smf future work
● Currently could only do 1.5MM qps on the server setup
○ size = 60 byte payload + 20 TCP frame bytes
○ Hit TCP.hh bug in seastar with httpd/seawreck and my own impl
■ `(_snd.window > 0) || ((_snd.window == 0) && (len == 1))' failed.
■ Could be my lab setup
■ Because of this - couldn't fill the wire fast enough
● Add JVM, Python, Go codegen
● Improve Docs: https://senior7515.github.io/smf/

THANK YOU
gallego.alexx@gmail.com | alexgallego.org
@emaxerrno
Please stay in touch
Any questions? https://senior7515.github.io/smf/

extra

smf Write Ahead Log (latency)
percentile   Apache Kafka   smf WAL   speedup
p50          878ms          21ms      41x
p95          1340ms         36ms      37x
p99          1814ms         49ms      37x
p999         1896ms         54ms      35x
p100         1930ms         54ms      35x
