SlideShare una empresa de Scribd logo
1 de 23
Kubernetes, Data Science and Machine Learning
Click to add text
Click to add text
Learn more at kublr.com/how-it-works/kublr-platform
Common ML Challenges and Approaches
Common ML challenges:
• Computer vision, natural language processing, speech
recognition, predictions, anomaly detection,
ML approaches:
• Supervised learning
• Classification, Regression
• Unsupervised learning
• Clustering
Typical ML Challenges
1. Data Source
2. Data preparation
3. Modelling
4. Model serving
5. Analysis
DB / File storage
Data Cleansing
Batches /Streaming
Data Transformation
Model Training
A/B Testing
Optimization
Inferencing
Results Exploration Interpretations
Why Using Kubernetes for ML?
Architecture – separation of concerns (dev, ops, infra), useful abstractions; universality
Pluggable and extensible – k8s is a set of open source microservices
Scalability and HA – autoscaling, resource management, self-healing
Container based – isolation, lightweight, few (if any) limitations on applications
Cloud and OS agnostic – Kubernetes + containers
Shared compute – RBAC, Limits, Quotas
On-demand – cloud support, autoscaling, reproducible applications
Frameworks – Great community
Kubernetes and Kublr for ML
• Infrastructure abstraction and scheduling
• DevOps and operational layer: monitoring and logging,
observability, HA
• Auto-scaling: HPA and cluster auto-scaler
• Kubernetes operators
• Storage (HDFS, Rook/Ceph)
• Custom resources and GPU
Kubernetes as an Orchestration Platform
Kubernetes
• Infrastructure abstraction
• Orchestration
• Network
• Configuration
• Service discovery
• Ingress
• Persistence
Master Node
K8s master components:
etcd, scheduler, api,
controller
K8s
metadata
Docker
kubelet
App data
K8s node components:
overlay network,
discovery, connectivity
Infrastructure and
application containers
Infrastructure and
application containers
Overlay
network
Kublr as Operations and DevOps Layer
K8S Clusters
PoC
Dev
Prod
Cloud
Data
center
API UI
Log collection
Operations
Monitoring
Authn and authz, SSO, fed
Audit Image Repo
Infrastructure management
Backup & DR
• Security
• Multiple environments
• Hybrid support
• Infrastructure
• Operations
• Monitoring and logs
• Backup and DR
• Container image
management
Horizontal Pod Autoscaler
• Cooldown/delay
• Rolling update
• Multiple metrics
• Custom metrics
Kubernetes
Deployment
HPA
Pod NPod 1
scale
...
metrics
Cluster Autoscaler
Kubernetes
Node group 1
Cluster
Autoscaler
Node NNode 1
scale
...
Resources
1. Requested by pods
2. Provided by nodes
• Multiple node groups
• AWS, Azure, GCE
• Cool-down period
• Scheduling rules
compliance
Node group M
Node NNode 1 ...
...
Master
Operators Kubernetes
K8S objects:
• Deployments
• Pods
• Namespaces
• Persistent Volumes
• Custom Resources
• ...
Operator
• Arbitrary software
• Operations automation
• Management automation
• Cloud native adaptation
• Custom Resources
• Annotations
• ...
Storage: HDFS and Hadoop
Hadoop/HDFS
• Scheduling tasks close to the data
• Reliable storage
• Established tool stack for data science and ML
Kubernetes
• Infrastructure management and recovery
• Underlying storage management
• Portability, hybrid support
Storage| Rook and Ceph
Rook = Ceph operator
Cloud native Ceph
Custom resources:
• Cluster
• Replica pool
• File system
Supported storage types:
• Block (rdb)
• Filesystem
• Object (S3, OpenStack
API)
[1] https://www.youtube.com/watch?v=iwVAvV_lI_Q
Kubernetes, GPU, and Kublr
• Standard Kubernetes resources: CPU, RAM, storage
• Custom resources – GPU, FPGA – via device plugins
• Nvidia GPU require driver and custom container runtime
• Kublr automates
• GPU driver installation
• Nvidia container runtime setup
• Nvidia device plugin setup
Major ML Stacks Compatible w/ Kubernetes
• Kubernetes TensorFlow TF-operator: github.com/kubeflow/kubeflow
• Spark 2.3.0
• In-house solution (model in Docker containers, run them on cloud or on-
prem Kubernetes )
• beam.apache.org
• Other rather “new” open source solutions
• Cloud and other vendor solutions
Kubeflow
• Simplify scaling and deploy machine learning applications
• Work on including different tooling
• Train/serve TensorFlow models in different environments
• Use Jupyter notebooks to manage TensorFlow training jobs
Spark without “Native " Kubernetes Support
A Spark standalone cluster in Kubernetes
Spark 2.3.0 bin/spark-submit 
--master k8s://https://<k8s-apiserver-
host>:<k8s-apiserver-port> 
--deploy-mode cluster 
--name spark-pi  --class
org.apache.spark.examples.SparkPi 
--conf spark.executor.instances=5 
--conf
spark.kubernetes.container.image=<spark-
image> 
local:///path/to/examples.jar
In-House Solution
• Custom Docker image with ML logic
• Kubernetes as scheduler
• Monitoring tools
• Custom implementation of distributed tasks scheduler or
framework (e.g. Celery)
Demo | ML and HPA with Custom Metrics
Demo
ML and HPA with Custom Metrics
To view the demo, check out our webinar on:
https://goo.gl/vY6HbE
kublr.com/demo
Vlad Penkin
Oleg Chunikhin
Arkadii Ocheretnoi
Thank you!

Más contenido relacionado

La actualidad más candente

Managing kubernetes deployment with operators
Managing kubernetes deployment with operatorsManaging kubernetes deployment with operators
Managing kubernetes deployment with operatorsCloud Technology Experts
 
Openstack days sv building highly available services using kubernetes (preso)
Openstack days sv   building highly available services using kubernetes (preso)Openstack days sv   building highly available services using kubernetes (preso)
Openstack days sv building highly available services using kubernetes (preso)Allan Naim
 
Setup Hybrid Clusters Using Kubernetes Federation
Setup Hybrid Clusters Using Kubernetes FederationSetup Hybrid Clusters Using Kubernetes Federation
Setup Hybrid Clusters Using Kubernetes Federationinwin stack
 
How to Run Kubernetes in Restrictive Environments
How to Run Kubernetes in Restrictive EnvironmentsHow to Run Kubernetes in Restrictive Environments
How to Run Kubernetes in Restrictive EnvironmentsKublr
 
Canary Releases on Kubernetes with Spinnaker, Istio, & Prometheus (2020)
Canary Releases on Kubernetes with Spinnaker, Istio, & Prometheus (2020)Canary Releases on Kubernetes with Spinnaker, Istio, & Prometheus (2020)
Canary Releases on Kubernetes with Spinnaker, Istio, & Prometheus (2020)Kublr
 
[Spark Summit 2017 NA] Apache Spark on Kubernetes
[Spark Summit 2017 NA] Apache Spark on Kubernetes[Spark Summit 2017 NA] Apache Spark on Kubernetes
[Spark Summit 2017 NA] Apache Spark on KubernetesTimothy Chen
 
Securing and Automating Kubernetes with Kyverno
Securing and Automating Kubernetes with KyvernoSecuring and Automating Kubernetes with Kyverno
Securing and Automating Kubernetes with KyvernoSaim Safder
 
Running I/O intensive workloads on Kubernetes, by Nati Shalom
Running I/O intensive workloads on Kubernetes, by Nati ShalomRunning I/O intensive workloads on Kubernetes, by Nati Shalom
Running I/O intensive workloads on Kubernetes, by Nati ShalomCloud Native Day Tel Aviv
 
Multi-Clusters Made Easy with Liqo:
Getting Rid of Your Clusters Keeping Them...
Multi-Clusters Made Easy with Liqo:
Getting Rid of Your Clusters Keeping Them...Multi-Clusters Made Easy with Liqo:
Getting Rid of Your Clusters Keeping Them...
Multi-Clusters Made Easy with Liqo:
Getting Rid of Your Clusters Keeping Them...KCDItaly
 
Top Considerations For Operating a Kubernetes Environment at Scale
Top Considerations For Operating a Kubernetes Environment at ScaleTop Considerations For Operating a Kubernetes Environment at Scale
Top Considerations For Operating a Kubernetes Environment at ScaleSignalFx
 
Sf bay area Kubernetes meetup dec8 2016 - deployment models
Sf bay area Kubernetes meetup dec8 2016 - deployment modelsSf bay area Kubernetes meetup dec8 2016 - deployment models
Sf bay area Kubernetes meetup dec8 2016 - deployment modelsPeter Ss
 
Spinnaker on Kubernetes
Spinnaker on KubernetesSpinnaker on Kubernetes
Spinnaker on KubernetesJinwoong Kim
 
Devops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
Devops Columbia October 2020 - Gabriel Alix: A Discussion on TerraformDevops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
Devops Columbia October 2020 - Gabriel Alix: A Discussion on TerraformDrew Malone
 
Advanced Scheduling in Kubernetes
Advanced Scheduling in KubernetesAdvanced Scheduling in Kubernetes
Advanced Scheduling in KubernetesKublr
 
Tectonic Summit 2016: Multi-Cluster Kubernetes: Planning for Unknowns
Tectonic Summit 2016: Multi-Cluster Kubernetes: Planning for UnknownsTectonic Summit 2016: Multi-Cluster Kubernetes: Planning for Unknowns
Tectonic Summit 2016: Multi-Cluster Kubernetes: Planning for UnknownsCoreOS
 
Secure Infrastructure Provisioning with Terraform Cloud, Vault + GitLab CI
Secure Infrastructure Provisioning with Terraform Cloud, Vault + GitLab CISecure Infrastructure Provisioning with Terraform Cloud, Vault + GitLab CI
Secure Infrastructure Provisioning with Terraform Cloud, Vault + GitLab CIMitchell Pronschinske
 
Cost-effective Compute Clusters with Spot and Pre-emptible Instances - KubeCo...
Cost-effective Compute Clusters with Spot and Pre-emptible Instances - KubeCo...Cost-effective Compute Clusters with Spot and Pre-emptible Instances - KubeCo...
Cost-effective Compute Clusters with Spot and Pre-emptible Instances - KubeCo...Platform9
 
使用 Prometheus 監控 Kubernetes Cluster
使用 Prometheus 監控 Kubernetes Cluster 使用 Prometheus 監控 Kubernetes Cluster
使用 Prometheus 監控 Kubernetes Cluster inwin stack
 
OpenStack on Kubernetes (BOS Summit / May 2017 update)
OpenStack on Kubernetes (BOS Summit / May 2017 update)OpenStack on Kubernetes (BOS Summit / May 2017 update)
OpenStack on Kubernetes (BOS Summit / May 2017 update)rhirschfeld
 
Storage os kubernetes clusters need persistent data
Storage os   kubernetes clusters need persistent dataStorage os   kubernetes clusters need persistent data
Storage os kubernetes clusters need persistent dataLibbySchulze
 

La actualidad más candente (20)

Managing kubernetes deployment with operators
Managing kubernetes deployment with operatorsManaging kubernetes deployment with operators
Managing kubernetes deployment with operators
 
Openstack days sv building highly available services using kubernetes (preso)
Openstack days sv   building highly available services using kubernetes (preso)Openstack days sv   building highly available services using kubernetes (preso)
Openstack days sv building highly available services using kubernetes (preso)
 
Setup Hybrid Clusters Using Kubernetes Federation
Setup Hybrid Clusters Using Kubernetes FederationSetup Hybrid Clusters Using Kubernetes Federation
Setup Hybrid Clusters Using Kubernetes Federation
 
How to Run Kubernetes in Restrictive Environments
How to Run Kubernetes in Restrictive EnvironmentsHow to Run Kubernetes in Restrictive Environments
How to Run Kubernetes in Restrictive Environments
 
Canary Releases on Kubernetes with Spinnaker, Istio, & Prometheus (2020)
Canary Releases on Kubernetes with Spinnaker, Istio, & Prometheus (2020)Canary Releases on Kubernetes with Spinnaker, Istio, & Prometheus (2020)
Canary Releases on Kubernetes with Spinnaker, Istio, & Prometheus (2020)
 
[Spark Summit 2017 NA] Apache Spark on Kubernetes
[Spark Summit 2017 NA] Apache Spark on Kubernetes[Spark Summit 2017 NA] Apache Spark on Kubernetes
[Spark Summit 2017 NA] Apache Spark on Kubernetes
 
Securing and Automating Kubernetes with Kyverno
Securing and Automating Kubernetes with KyvernoSecuring and Automating Kubernetes with Kyverno
Securing and Automating Kubernetes with Kyverno
 
Running I/O intensive workloads on Kubernetes, by Nati Shalom
Running I/O intensive workloads on Kubernetes, by Nati ShalomRunning I/O intensive workloads on Kubernetes, by Nati Shalom
Running I/O intensive workloads on Kubernetes, by Nati Shalom
 
Multi-Clusters Made Easy with Liqo:
Getting Rid of Your Clusters Keeping Them...
Multi-Clusters Made Easy with Liqo:
Getting Rid of Your Clusters Keeping Them...Multi-Clusters Made Easy with Liqo:
Getting Rid of Your Clusters Keeping Them...
Multi-Clusters Made Easy with Liqo:
Getting Rid of Your Clusters Keeping Them...
 
Top Considerations For Operating a Kubernetes Environment at Scale
Top Considerations For Operating a Kubernetes Environment at ScaleTop Considerations For Operating a Kubernetes Environment at Scale
Top Considerations For Operating a Kubernetes Environment at Scale
 
Sf bay area Kubernetes meetup dec8 2016 - deployment models
Sf bay area Kubernetes meetup dec8 2016 - deployment modelsSf bay area Kubernetes meetup dec8 2016 - deployment models
Sf bay area Kubernetes meetup dec8 2016 - deployment models
 
Spinnaker on Kubernetes
Spinnaker on KubernetesSpinnaker on Kubernetes
Spinnaker on Kubernetes
 
Devops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
Devops Columbia October 2020 - Gabriel Alix: A Discussion on TerraformDevops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
Devops Columbia October 2020 - Gabriel Alix: A Discussion on Terraform
 
Advanced Scheduling in Kubernetes
Advanced Scheduling in KubernetesAdvanced Scheduling in Kubernetes
Advanced Scheduling in Kubernetes
 
Tectonic Summit 2016: Multi-Cluster Kubernetes: Planning for Unknowns
Tectonic Summit 2016: Multi-Cluster Kubernetes: Planning for UnknownsTectonic Summit 2016: Multi-Cluster Kubernetes: Planning for Unknowns
Tectonic Summit 2016: Multi-Cluster Kubernetes: Planning for Unknowns
 
Secure Infrastructure Provisioning with Terraform Cloud, Vault + GitLab CI
Secure Infrastructure Provisioning with Terraform Cloud, Vault + GitLab CISecure Infrastructure Provisioning with Terraform Cloud, Vault + GitLab CI
Secure Infrastructure Provisioning with Terraform Cloud, Vault + GitLab CI
 
Cost-effective Compute Clusters with Spot and Pre-emptible Instances - KubeCo...
Cost-effective Compute Clusters with Spot and Pre-emptible Instances - KubeCo...Cost-effective Compute Clusters with Spot and Pre-emptible Instances - KubeCo...
Cost-effective Compute Clusters with Spot and Pre-emptible Instances - KubeCo...
 
使用 Prometheus 監控 Kubernetes Cluster
使用 Prometheus 監控 Kubernetes Cluster 使用 Prometheus 監控 Kubernetes Cluster
使用 Prometheus 監控 Kubernetes Cluster
 
OpenStack on Kubernetes (BOS Summit / May 2017 update)
OpenStack on Kubernetes (BOS Summit / May 2017 update)OpenStack on Kubernetes (BOS Summit / May 2017 update)
OpenStack on Kubernetes (BOS Summit / May 2017 update)
 
Storage os kubernetes clusters need persistent data
Storage os   kubernetes clusters need persistent dataStorage os   kubernetes clusters need persistent data
Storage os kubernetes clusters need persistent data
 

Similar a Kubernetes data science and machine learning

Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
Kubernetes stack reliability
Kubernetes stack reliabilityKubernetes stack reliability
Kubernetes stack reliabilityOleg Chunikhin
 
How Self-Healing Nodes and Infrastructure Management Impact Reliability
How Self-Healing Nodes and Infrastructure Management Impact ReliabilityHow Self-Healing Nodes and Infrastructure Management Impact Reliability
How Self-Healing Nodes and Infrastructure Management Impact ReliabilityKublr
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsDataWorks Summit/Hadoop Summit
 
Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsDataWorks Summit/Hadoop Summit
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsDataWorks Summit
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesDataWorks Summit
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018Rachit Arora
 
Fb talk arch_summit
Fb talk arch_summitFb talk arch_summit
Fb talk arch_summitdrewz lin
 
Wicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
Wicked Easy Ceph Block Storage & OpenStack Deployment with CrowbarWicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
Wicked Easy Ceph Block Storage & OpenStack Deployment with CrowbarKamesh Pemmaraju
 
'Cloud-Native' Ecosystem - Aug 2015
'Cloud-Native' Ecosystem - Aug 2015'Cloud-Native' Ecosystem - Aug 2015
'Cloud-Native' Ecosystem - Aug 2015Lenny Pruss
 
What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?DataWorks Summit
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesKamesh Pemmaraju
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Red_Hat_Storage
 
Storage as a service and OpenStack Cinder
Storage as a service and OpenStack CinderStorage as a service and OpenStack Cinder
Storage as a service and OpenStack Cinderopenstackindia
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Community
 
Apache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsApache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsOleg Magazov
 
Lessons learned from running Spark on Docker
Lessons learned from running Spark on DockerLessons learned from running Spark on Docker
Lessons learned from running Spark on DockerDataWorks Summit
 
Cloudera Impala - San Diego Big Data Meetup August 13th 2014
Cloudera Impala - San Diego Big Data Meetup August 13th 2014Cloudera Impala - San Diego Big Data Meetup August 13th 2014
Cloudera Impala - San Diego Big Data Meetup August 13th 2014cdmaxime
 

Similar a Kubernetes data science and machine learning (20)

Big Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the ExpertsBig Data in the Cloud - The What, Why and How from the Experts
Big Data in the Cloud - The What, Why and How from the Experts
 
Kubernetes stack reliability
Kubernetes stack reliabilityKubernetes stack reliability
Kubernetes stack reliability
 
How Self-Healing Nodes and Infrastructure Management Impact Reliability
How Self-Healing Nodes and Infrastructure Management Impact ReliabilityHow Self-Healing Nodes and Infrastructure Management Impact Reliability
How Self-Healing Nodes and Infrastructure Management Impact Reliability
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
 
Hadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the ExpertsHadoop in the Cloud – The What, Why and How from the Experts
Hadoop in the Cloud – The What, Why and How from the Experts
 
Hadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the expertsHadoop in the cloud – The what, why and how from the experts
Hadoop in the cloud – The what, why and how from the experts
 
Storage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on KubernetesStorage Requirements and Options for Running Spark on Kubernetes
Storage Requirements and Options for Running Spark on Kubernetes
 
Spark volume requirements 2018
Spark volume requirements 2018Spark volume requirements 2018
Spark volume requirements 2018
 
Fb talk arch_summit
Fb talk arch_summitFb talk arch_summit
Fb talk arch_summit
 
Wicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
Wicked Easy Ceph Block Storage & OpenStack Deployment with CrowbarWicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
Wicked Easy Ceph Block Storage & OpenStack Deployment with Crowbar
 
'Cloud-Native' Ecosystem - Aug 2015
'Cloud-Native' Ecosystem - Aug 2015'Cloud-Native' Ecosystem - Aug 2015
'Cloud-Native' Ecosystem - Aug 2015
 
What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?What's the Hadoop-la about Kubernetes?
What's the Hadoop-la about Kubernetes?
 
New Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference ArchitecturesNew Ceph capabilities and Reference Architectures
New Ceph capabilities and Reference Architectures
 
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
Software Defined Storage, Big Data and Ceph - What Is all the Fuss About?
 
Storage as a service and OpenStack Cinder
Storage as a service and OpenStack CinderStorage as a service and OpenStack Cinder
Storage as a service and OpenStack Cinder
 
HDF Cloud Services
HDF Cloud ServicesHDF Cloud Services
HDF Cloud Services
 
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
Ceph Day New York 2014: Best Practices for Ceph-Powered Implementations of St...
 
Apache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and BasicsApache Cassandra training. Overview and Basics
Apache Cassandra training. Overview and Basics
 
Lessons learned from running Spark on Docker
Lessons learned from running Spark on DockerLessons learned from running Spark on Docker
Lessons learned from running Spark on Docker
 
Cloudera Impala - San Diego Big Data Meetup August 13th 2014
Cloudera Impala - San Diego Big Data Meetup August 13th 2014Cloudera Impala - San Diego Big Data Meetup August 13th 2014
Cloudera Impala - San Diego Big Data Meetup August 13th 2014
 

Más de Kublr

Container Runtimes and Tooling, v2
Container Runtimes and Tooling, v2Container Runtimes and Tooling, v2
Container Runtimes and Tooling, v2Kublr
 
Container Runtimes and Tooling
Container Runtimes and ToolingContainer Runtimes and Tooling
Container Runtimes and ToolingKublr
 
Kubernetes in Hybrid Environments with Submariner
Kubernetes in Hybrid Environments with SubmarinerKubernetes in Hybrid Environments with Submariner
Kubernetes in Hybrid Environments with SubmarinerKublr
 
Intro into Rook and Ceph on Kubernetes
Intro into Rook and Ceph on KubernetesIntro into Rook and Ceph on Kubernetes
Intro into Rook and Ceph on KubernetesKublr
 
Hybrid architecture solutions with kubernetes and the cloud native stack
Hybrid architecture solutions with kubernetes and the cloud native stackHybrid architecture solutions with kubernetes and the cloud native stack
Hybrid architecture solutions with kubernetes and the cloud native stackKublr
 
Multi-cloud Kubernetes BCDR with Velero
Multi-cloud Kubernetes BCDR with VeleroMulti-cloud Kubernetes BCDR with Velero
Multi-cloud Kubernetes BCDR with VeleroKublr
 
Kubernetes Networking 101
Kubernetes Networking 101Kubernetes Networking 101
Kubernetes Networking 101Kublr
 
Kubernetes Ingress 101
Kubernetes Ingress 101Kubernetes Ingress 101
Kubernetes Ingress 101Kublr
 
Kubernetes persistence 101
Kubernetes persistence 101Kubernetes persistence 101
Kubernetes persistence 101Kublr
 
Portable CI/CD Environment as Code with Kubernetes, Kublr and Jenkins
Portable CI/CD Environment as Code with Kubernetes, Kublr and JenkinsPortable CI/CD Environment as Code with Kubernetes, Kublr and Jenkins
Portable CI/CD Environment as Code with Kubernetes, Kublr and JenkinsKublr
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101Kublr
 
Setting up CI/CD Pipeline with Kubernetes and Kublr step by-step
Setting up CI/CD Pipeline with Kubernetes and Kublr step by-stepSetting up CI/CD Pipeline with Kubernetes and Kublr step by-step
Setting up CI/CD Pipeline with Kubernetes and Kublr step by-stepKublr
 
Building Portable Applications with Kubernetes
Building Portable Applications with KubernetesBuilding Portable Applications with Kubernetes
Building Portable Applications with KubernetesKublr
 
Introduction to Kubernetes RBAC
Introduction to Kubernetes RBACIntroduction to Kubernetes RBAC
Introduction to Kubernetes RBACKublr
 
Canary Releases on Kubernetes w/ Spinnaker, Istio, and Prometheus
Canary Releases on Kubernetes w/ Spinnaker, Istio, and PrometheusCanary Releases on Kubernetes w/ Spinnaker, Istio, and Prometheus
Canary Releases on Kubernetes w/ Spinnaker, Istio, and PrometheusKublr
 

Más de Kublr (15)

Container Runtimes and Tooling, v2
Container Runtimes and Tooling, v2Container Runtimes and Tooling, v2
Container Runtimes and Tooling, v2
 
Container Runtimes and Tooling
Container Runtimes and ToolingContainer Runtimes and Tooling
Container Runtimes and Tooling
 
Kubernetes in Hybrid Environments with Submariner
Kubernetes in Hybrid Environments with SubmarinerKubernetes in Hybrid Environments with Submariner
Kubernetes in Hybrid Environments with Submariner
 
Intro into Rook and Ceph on Kubernetes
Intro into Rook and Ceph on KubernetesIntro into Rook and Ceph on Kubernetes
Intro into Rook and Ceph on Kubernetes
 
Hybrid architecture solutions with kubernetes and the cloud native stack
Hybrid architecture solutions with kubernetes and the cloud native stackHybrid architecture solutions with kubernetes and the cloud native stack
Hybrid architecture solutions with kubernetes and the cloud native stack
 
Multi-cloud Kubernetes BCDR with Velero
Multi-cloud Kubernetes BCDR with VeleroMulti-cloud Kubernetes BCDR with Velero
Multi-cloud Kubernetes BCDR with Velero
 
Kubernetes Networking 101
Kubernetes Networking 101Kubernetes Networking 101
Kubernetes Networking 101
 
Kubernetes Ingress 101
Kubernetes Ingress 101Kubernetes Ingress 101
Kubernetes Ingress 101
 
Kubernetes persistence 101
Kubernetes persistence 101Kubernetes persistence 101
Kubernetes persistence 101
 
Portable CI/CD Environment as Code with Kubernetes, Kublr and Jenkins
Portable CI/CD Environment as Code with Kubernetes, Kublr and JenkinsPortable CI/CD Environment as Code with Kubernetes, Kublr and Jenkins
Portable CI/CD Environment as Code with Kubernetes, Kublr and Jenkins
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
Setting up CI/CD Pipeline with Kubernetes and Kublr step by-step
Setting up CI/CD Pipeline with Kubernetes and Kublr step by-stepSetting up CI/CD Pipeline with Kubernetes and Kublr step by-step
Setting up CI/CD Pipeline with Kubernetes and Kublr step by-step
 
Building Portable Applications with Kubernetes
Building Portable Applications with KubernetesBuilding Portable Applications with Kubernetes
Building Portable Applications with Kubernetes
 
Introduction to Kubernetes RBAC
Introduction to Kubernetes RBACIntroduction to Kubernetes RBAC
Introduction to Kubernetes RBAC
 
Canary Releases on Kubernetes w/ Spinnaker, Istio, and Prometheus
Canary Releases on Kubernetes w/ Spinnaker, Istio, and PrometheusCanary Releases on Kubernetes w/ Spinnaker, Istio, and Prometheus
Canary Releases on Kubernetes w/ Spinnaker, Istio, and Prometheus
 

Último

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfRankYa
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfPrecisely
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 

Último (20)

TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Search Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdfSearch Engine Optimization SEO PDF for 2024.pdf
Search Engine Optimization SEO PDF for 2024.pdf
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdfHyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 

Kubernetes data science and machine learning

  • 1. Kubernetes, Data Science and Machine Learning
  • 2. Click to add text Click to add text Learn more at kublr.com/how-it-works/kublr-platform
  • 3. Common ML Challenges and Approaches Common ML challenges: • Computer vision, natural language processing, speech recognition, predictions, anomaly detection, ML approaches: • Supervised learning • Classification, Regression • Unsupervised learning • Clustering
  • 4. Typical ML Challenges 1. Data Source 2. Data preparation 3. Modelling 4. Model serving 5. Analysis DB / File storage Data Cleansing Batches /Streaming Data Transformation Model Training A/B Testing Optimization Inferencing Results Exploration Interpretations
  • 5. Why Using Kubernetes for ML? Architecture – separation of concerns (dev, ops, infra), useful abstractions; universality Pluggable and extensible – k8s is a set of open source microservices Scalability and HA – autoscaling, resource management, self-healing Container based – isolation, lightweight, few (if any) limitations on applications Cloud and OS agnostic – Kubernetes + containers Shared compute – RBAC, Limits, Quotas On-demand – cloud support, autoscaling, reproducible applications Frameworks – Great community
  • 6. Kubernetes and Kublr for ML • Infrastructure abstraction and scheduling • DevOps and operational layer: monitoring and logging, observability, HA • Auto-scaling: HPA and cluster auto-scaler • Kubernetes operators • Storage (HDFS, Rook/Ceph) • Custom resources and GPU
  • 7. Kubernetes as an Orchestration Platform Kubernetes • Infrastructure abstraction • Orchestration • Network • Configuration • Service discovery • Ingress • Persistence Master Node K8s master components: etcd, scheduler, api, controller K8s metadata Docker kubelet App data K8s node components: overlay network, discovery, connectivity Infrastructure and application containers Infrastructure and application containers Overlay network
  • 8. Kublr as Operations and DevOps Layer K8S Clusters PoC Dev Prod Cloud Data center API UI Log collection Operations Monitoring Authn and authz, SSO, fed Audit Image Repo Infrastructure management Backup & DR • Security • Multiple environments • Hybrid support • Infrastructure • Operations • Monitoring and logs • Backup and DR • Container image management
  • 9. Horizontal Pod Autoscaler • Cooldown/delay • Rolling update • Multiple metrics • Custom metrics Kubernetes Deployment HPA Pod NPod 1 scale ... metrics
  • 10. Cluster Autoscaler Kubernetes Node group 1 Cluster Autoscaler Node NNode 1 scale ... Resources 1. Requested by pods 2. Provided by nodes • Multiple node groups • AWS, Azure, GCE • Cool-down period • Scheduling rules compliance Node group M Node NNode 1 ... ... Master
  • 11. Operators Kubernetes K8S objects: • Deployments • Pods • Namespaces • Persistent Volumes • Custom Resources • ... Operator • Arbitrary software • Operations automation • Management automation • Cloud native adaptation • Custom Resources • Annotations • ...
  • 12. Storage: HDFS and Hadoop Hadoop/HDFS • Scheduling tasks close to the data • Reliable storage • Established tool stack for data science and ML Kubernetes • Infrastructure management and recovery • Underlying storage management • Portability, hybrid support
  • 13. Storage| Rook and Ceph Rook = Ceph operator Cloud native Ceph Custom resources: • Cluster • Replica pool • File system Supported storage types: • Block (rdb) • Filesystem • Object (S3, OpenStack API) [1] https://www.youtube.com/watch?v=iwVAvV_lI_Q
  • 14. Kubernetes, GPU, and Kublr • Standard Kubernetes resources: CPU, RAM, storage • Custom resources – GPU, FPGA – via device plugins • Nvidia GPU require driver and custom container runtime • Kublr automates • GPU driver installation • Nvidia container runtime setup • Nvidia device plugin setup
  • 15. Major ML Stacks Compatible w/ Kubernetes • Kubernetes TensorFlow TF-operator: github.com/kubeflow/kubeflow • Spark 2.3.0 • In-house solution (model in Docker containers, run them on cloud or on- prem Kubernetes ) • beam.apache.org • Other rather “new” open source solutions • Cloud and other vendor solutions
  • 16. Kubeflow • Simplify scaling and deploy machine learning applications • Work on including different tooling • Train/serve TensorFlow models in different environments • Use Jupyter notebooks to manage TensorFlow training jobs
  • 17. Spark without “Native " Kubernetes Support A Spark standalone cluster in Kubernetes
  • 18. Spark 2.3.0 bin/spark-submit --master k8s://https://<k8s-apiserver- host>:<k8s-apiserver-port> --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=5 --conf spark.kubernetes.container.image=<spark- image> local:///path/to/examples.jar
  • 19. In-House Solution • Custom Docker image with ML logic • Kubernetes as scheduler • Monitoring tools • Custom implementation of distributed tasks scheduler or framework (e.g. Celery)
  • 20. Demo | ML and HPA with Custom Metrics
  • 21. Demo ML and HPA with Custom Metrics To view the demo, check out our webinar on: https://goo.gl/vY6HbE
  • 23. Vlad Penkin Oleg Chunikhin Arkadii Ocheretnoi Thank you!

Notas del editor

  1. Oleg C