Running large Kubernetes clusters is challenging. This talk focuses on how to optimize the network setup in clusters of 1000-2000 nodes. It discusses standard ingress solutions and their drawbacks, as well as potential alternatives.
2. Network challenges
Scale: 1000-2000 node clusters
Throughput: trillions of data points daily
Latency: end-to-end pipeline
Topology: multiple clusters, access from standard VMs
3. Addressing this in Kubernetes
IPVS
Native load-balancer, more efficient: better latency, less CPU usage
Still a bit young
Native pod routing
No bridging: route on the host (requires specific CNI plugins)
Pods get standard IPs: no overlay overhead
Cross-cluster access
Much better ingresses
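IPVS mode is selected through the kube-proxy configuration. A minimal sketch (the scheduler choice is illustrative):

```yaml
# kube-proxy configuration enabling the IPVS backend instead of iptables
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs"
ipvs:
  # round-robin scheduling; other IPVS schedulers (lc, sh, ...) exist
  scheduler: "rr"
```

The nodes also need the IPVS kernel modules (ip_vs, ip_vs_rr, nf_conntrack) loaded for kube-proxy to start in this mode.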
8. L7-proxy ingress controller
[Diagram: external client → Load-Balancer → l7proxy pods → kube-proxy NodePorts (NP) on the nodes → backend pods; a healthchecker probes the NodePorts; ingress-controller and service-controller run on the master]
Data path: Client → Load-Balancer → l7proxy → kube-proxy (NodePort) → pod
Health checks: healthchecker probes the NodePorts
Configuration:
from watching ingresses/endpoints on apiservers (ingress-controller): creates l7proxy deployments, updates backends using service endpoints
from watching LoadBalancer services (service-controller)
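A minimal Ingress manifest of the kind the ingress-controller watches; the names and host below are illustrative:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: app-ingress          # hypothetical name
spec:
  rules:
  - host: app.example.com    # hypothetical host
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: app-svc    # endpoints of this service become l7proxy backends
            port:
              number: 80
```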
9. With native pod routing
[Diagram: external client → Load-Balancer → pods directly; the healthchecker probes pod IPs]
Data path: Client → Load-Balancer → pod (no l7proxy or kube-proxy hop)
Health checks target pod IPs directly
Configuration: from watching ingresses/endpoints on apiservers (ingress-controller)
AWS: ALB with pod IP targets
GCP: GCLB with NEG (Network Endpoint Groups)
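On GCP, container-native load balancing is requested per service with the NEG annotation; a sketch with illustrative names:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: app-svc                              # hypothetical name
  annotations:
    # ask GKE to create Network Endpoint Groups so the GCLB
    # targets pod IPs directly instead of going through NodePorts
    cloud.google.com/neg: '{"ingress": true}'
spec:
  selector:
    app: app                                 # hypothetical label
  ports:
  - port: 80
    targetPort: 8080
```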
10. Remaining challenges
Limited to HTTP ingresses: no TCP/UDP traffic; need to change LB controllers (NLB / NEG support TCP)
Limited load balancer support: no ELB, no ILB
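For TCP traffic on AWS, a LoadBalancer service can request an NLB instead of the classic ELB via an annotation; a sketch with illustrative names:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: tcp-app                                        # hypothetical name
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
  type: LoadBalancer
  selector:
    app: tcp-app                                       # hypothetical label
  ports:
  - port: 9000
    protocol: TCP
```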