End-to-End ML Pipelines
TFX + KubeFlow + Airflow + MLflow + TPU
Chris Fregly
Founder @ PipelineAI
Real-time Machine Learning and AI in Production
Former Databricks, Netflix
Apache Spark Contributor
O’Reilly Author
High Performance TensorFlow in Production
Meetup Organizer
Advanced Spark and TensorFlow Meetup
Who Am I? (@cfregly)
Advanced Spark and TensorFlow Meetup (Global, Monthly Events)
https://meetup.com/Advanced-Spark-and-TensorFlow-Meetup
Upcoming Full-Day Workshop on Saturday, November 2, 2019!
https://pipeline.ai @cfregly @PipelineAI
Next Workshop:
Nov 2, 2019
Who are you?
1 OK with Command Line?
2 OK with Python?
3 OK with Linear Algebra?
4 OK with Docker?
5 OK with Jupyter Notebook?
Recent Poll (July 2019)
4,000 Stars = $6,000,000 Seed
$1,500 per GitHub Star?!
(Please star our repo ASAP!!)
Recent Comment from Popular VC Investor in Silicon Valley
Community Edition
https://community.pipeline.ai
Note #1 of 10
IGNORE WARNINGS & ERRORS
Everything will be OK!
Note #2 of 10
THERE IS A LOT OF MATERIAL HERE
Many opportunities to explore on your own.
(Don’t upload sensitive data)
Note #3 of 10
YOU HAVE YOUR OWN INSTANCE
16 CPU, 104 GB RAM, 200GB SSD
Note #4 of 10
DATASETS
Chicago Taxi Dataset
(and various others)
Note #5 of 10
SOME NOTEBOOKS TAKE MINUTES
Please be patient.
(We are using large datasets)
Note #6 of 10
QUESTIONS?
Post questions to Zoom chat or Q&A.
(Antje and I will answer soon)
Note #7 of 10
KUBEFLOW IS NOT A SILVER BULLET
There are still gaps in the pipeline.
(But gaps are getting smaller)
Note #8 of 10
THIS IS NOT CLOUD DEPENDENT*
*Except for 2 small exceptions…
Patches are underway.
Note #9 of 10
PRIMARILY TENSORFLOW 1.x
TF 2.x is not fully supported by TFX
(We have a section on TF 2)
Note #10 of 10
SHUTDOWN EACH NOTEBOOK AFTER
We are using complex browser voodoo.
[Diagram: ML workflow stages spread across six separate systems (System 1 to System 6): Data Ingestion, Data Analysis, Data Transform, Data Validation, Data Splitting, Ad-Hoc Training, Build Model, Model Validation, Training At Scale, Serving, Logging, Monitoring, Roll-out]
Why TFX and Why KubeFlow?
Improve Training/Serving Consistency
Unify Disparate Systems
Manage Pipeline Complexity
Improve Portability
Wrangle Large Datasets
Improve Model Quality
Manage Versions
Composability
Distributed Training
Configure
Agenda
1 Setup Environment with Kubernetes
2 TensorFlow Extended (TFX)
3 ML Pipelines with Airflow and KubeFlow
4 Hyper-Parameter Tuning with KubeFlow
5 Deploy Notebook with Kubernetes
Bonus Extras!
6 TPUs
7 MLflow
8 Papermill
9 TensorFlow Privacy
10 TensorFlow 2.0
11 Keras Tuner
12 Metrics and Monitoring
13 A/B Tests
Hands On
00_Explore_Environment
1.0 Environment Overview
1.1 Kubernetes
1.2 TensorFlow Extended (TFX)
1.3 Airflow ML Pipelines
1.4 KubeFlow ML Pipelines
1.5 MLflow Pipelines
1.6 Hyper-Parameter Tuning (Katib)
1.7 Prediction Traffic Router (Istio)
1.1 Kubernetes
[Diagram: Kubernetes stack. Workloads: NFS, Ceph, Cassandra, MySQL, Spark, Airflow, TensorFlow, Caffe, TF-Serving, Flask+Scikit, Jupyter. Cluster features: Namespace, Quota, Logging, Monitoring, RBAC. Operating system: Linux, Windows. Hardware: CPU, Memory, Disk, SSD, GPU, FPGA, ASIC, NIC. Infrastructure: GCP, AWS, Azure, On-prem]
Hands On
01_Explore_Kubernetes_Cluster
1.2 TensorFlow Extended (TFX)
[Pipeline diagram: Feature Load → Feature Analyze → Feature Transform → Model Train → Model Evaluate → Model Deploy → Reproduce Training]
1.3 Airflow ML Pipelines
1.4 KubeFlow ML Pipelines
1.5 MLflow Experiment Tracking
1.6 Hyper-Parameter Tuning (Katib)
1.7 Prediction Traffic Routing (Istio)
2.0 TFX Components
2.1 TFX Internals
2.2 TFX Libraries
2.3 TFX Components
2.1 TFX Internals
Driver/Publisher
Moves data to/from Metadata Store
Executor
Runs the Actual Processing Code
Metadata Store
Artifact, execution, and lineage Info
Track inputs & outputs of all components
Stores training run including inputs & outputs
Analysis, validation, and versioning results
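The driver / executor / publisher split above can be sketched in plain Python. This is only an illustration of the pattern, not TFX code: `MetadataStore` and `run_component` are hypothetical names, and the store here is an in-memory dictionary rather than ML Metadata.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class MetadataStore:
    """In-memory stand-in for a metadata store (hypothetical, not MLMD)."""
    artifacts: Dict[str, object] = field(default_factory=dict)
    executions: List[dict] = field(default_factory=list)


def run_component(name: str,
                  input_keys: List[str],
                  output_key: str,
                  executor: Callable[..., object],
                  store: MetadataStore) -> object:
    # Driver: resolve the component's inputs from the metadata store.
    inputs = [store.artifacts[key] for key in input_keys]
    # Executor: run the actual processing code.
    output = executor(*inputs)
    # Publisher: record the output artifact and the lineage of this run.
    store.artifacts[output_key] = output
    store.executions.append(
        {"component": name, "inputs": list(input_keys), "output": output_key})
    return output


store = MetadataStore()
store.artifacts["raw_examples"] = [3.0, 5.0, 10.0]

run_component("StatisticsGen", ["raw_examples"], "stats",
              lambda xs: {"mean": sum(xs) / len(xs)}, store)
run_component("Transform", ["raw_examples", "stats"], "centered",
              lambda xs, stats: [x - stats["mean"] for x in xs], store)

# Lineage: which component produced which artifact from which inputs.
lineage = [(e["component"], e["inputs"], e["output"]) for e in store.executions]
```

Because every input and output passes through the store, any artifact can be traced back to the run that produced it.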
2.2 TFX Libraries
TFX Components Use These:
2.2.1 TensorFlow Data Validation (TFDV)
2.2.2 TensorFlow Transform (TFT)
2.2.3 TensorFlow Model Analysis (TFMA)
2.2.4 TensorFlow Metadata (TFMD) + ML Metadata (MLMD)
2.2.1 TFX Libraries - TFDV
TensorFlow Data Validation (TFDV)
Find Missing, Redundant & Important Features
Identify Features with Unusually-Large Scale
`infer_schema()` Generates Schema
Describe Feature Ranges
Detect Data Drift
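To make the schema and drift ideas above concrete, here is a library-free sketch. It does not call TFDV; the function names and the tiny taxi-style rows are made up for illustration. The drift measure is the L-infinity distance between two categorical distributions, which is one common way to compare them.

```python
from collections import Counter
from typing import Dict, List, Tuple


def infer_schema(rows: List[dict]) -> Dict[str, dict]:
    """Infer a type and a required/optional flag for each feature."""
    schema: Dict[str, dict] = {}
    names = {name for row in rows for name in row}
    for name in sorted(names):
        values = [row[name] for row in rows if row.get(name) is not None]
        schema[name] = {
            "type": type(values[0]).__name__ if values else "unknown",
            "required": len(values) == len(rows),
        }
    return schema


def find_anomalies(rows: List[dict], schema: Dict[str, dict]) -> List[Tuple[int, str, str]]:
    """Report rows that violate the schema as (row index, feature, problem)."""
    anomalies = []
    for index, row in enumerate(rows):
        for name, spec in schema.items():
            value = row.get(name)
            if value is None:
                if spec["required"]:
                    anomalies.append((index, name, "missing"))
            elif type(value).__name__ != spec["type"]:
                anomalies.append((index, name, "wrong type"))
    return anomalies


def linf_drift(train_values: List[str], serving_values: List[str]) -> float:
    """L-infinity distance between two categorical value distributions."""
    train, serving = Counter(train_values), Counter(serving_values)
    keys = set(train) | set(serving)
    return max(abs(train[k] / len(train_values) - serving[k] / len(serving_values))
               for k in keys)


train_rows = [
    {"payment": "cash", "fare": 7.5},
    {"payment": "card", "fare": 12.0},
    {"payment": "cash", "fare": 5.0},
    {"payment": "cash", "fare": 9.0},
]
serving_rows = [
    {"payment": "card", "fare": 8.0},
    {"payment": "card"},                # fare is missing
    {"payment": "card", "fare": "11"},  # fare has the wrong type
    {"payment": "cash", "fare": 6.0},
]

schema = infer_schema(train_rows)
anomalies = find_anomalies(serving_rows, schema)
drift = linf_drift([r["payment"] for r in train_rows],
                   [r["payment"] for r in serving_rows])
```

As with TFDV's inferred schema, the result is a best-effort starting point that a human should review.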
Uniformly Distributed Data →
← Non-Uniformly Distributed Data
Hands On
02_TensorFlow_Data_Validation
(TFDV)
2.2.2 TFX Libraries - TFT
TensorFlow Transform (TFT)
Preprocess `tf.Example` data with TensorFlow
Useful for data that requires a full pass
Normalize all inputs by mean and std dev
Create vocabulary of strings → integers over all data
Bucketize features based on entire data distribution
Outputs a TensorFlow graph
Re-used across both training and serving
Uses Apache Beam (local mode) for Parallel Analysis
Can also use distributed mode
`preprocessing_fn(inputs)`: Primary Fn to Implement
import tensorflow as tf
import tensorflow_transform as tft

def preprocessing_fn(inputs):
    x = inputs['x']
    y = inputs['y']
    s = inputs['s']
    x_centered = x - tft.mean(x)
    y_normalized = tft.scale_to_0_1(y)
    s_integerized = tft.compute_and_apply_vocabulary(s)
    x_centered_times_y_normalized = x_centered * y_normalized
    return {
        'x_centered': x_centered,
        'y_normalized': y_normalized,
        'x_centered_times_y_normalized': x_centered_times_y_normalized,
        's_integerized': s_integerized
    }
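What "requires a full pass" means can be shown without TensorFlow. The sketch below mirrors the `preprocessing_fn` above in plain Python: an analyze step computes global statistics over all rows, and a per-row transform step then uses only those constants, which is why the same logic can be reused at serving time. The `analyze` and `transform` helpers are hypothetical, the vocabulary ordering (most frequent first, ties alphabetical) is an assumption of this sketch, and it assumes `y` is not constant.

```python
from collections import Counter
from typing import Any, Dict, List


def analyze(rows: List[dict]) -> Dict[str, Any]:
    """Full pass over the dataset: compute the global statistics."""
    xs = [row["x"] for row in rows]
    ys = [row["y"] for row in rows]
    counts = Counter(row["s"] for row in rows)
    # Assumed ordering: descending frequency, ties broken alphabetically.
    vocab = sorted(counts, key=lambda token: (-counts[token], token))
    return {
        "x_mean": sum(xs) / len(xs),
        "y_min": min(ys),
        "y_max": max(ys),
        "vocab": {token: index for index, token in enumerate(vocab)},
    }


def transform(row: dict, stats: Dict[str, Any]) -> dict:
    """Per-row step: uses only the precomputed constants."""
    x_centered = row["x"] - stats["x_mean"]
    y_normalized = (row["y"] - stats["y_min"]) / (stats["y_max"] - stats["y_min"])
    return {
        "x_centered": x_centered,
        "y_normalized": y_normalized,
        "x_centered_times_y_normalized": x_centered * y_normalized,
        "s_integerized": stats["vocab"][row["s"]],
    }


rows = [
    {"x": 1.0, "y": 10.0, "s": "hello"},
    {"x": 2.0, "y": 20.0, "s": "world"},
    {"x": 3.0, "y": 30.0, "s": "hello"},
]
stats = analyze(rows)                                  # training time only
transformed = [transform(row, stats) for row in rows]  # training
served = transform({"x": 5.0, "y": 20.0, "s": "world"}, stats)  # serving
```

The serving call never recomputes the mean or the vocabulary, so training and serving see identical transformations.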
Hands On
03_TensorFlow_Transform
(TFT)
Hands On
03a_TensorFlow_Transform_Advanced
(TFT)
2.2.3 TFX Libraries - TFMA
TensorFlow Model Analysis (TFMA)
Analyze Model on Different Slices of Dataset
Track Metrics Over Time (“Next Day Eval”)
`EvalSavedModel` Contains Slicing Info
TFMA Pipeline: Read, Extract, Evaluate, Write
i.e., Ensure Model Works Fairly Across All Users
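Slicing is easy to picture with a small, library-free sketch. This is not the TFMA API; `sliced_accuracy` and the four example rows are invented for illustration, with `trip_start_hour` borrowed from the taxi dataset as the slice column.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional


def sliced_accuracy(examples: Iterable[dict],
                    slice_key: Optional[str] = None) -> Dict[object, float]:
    """Accuracy overall (slice_key=None) or per value of one feature."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for example in examples:
        key = example[slice_key] if slice_key else "overall"
        total[key] += 1
        correct[key] += int(example["label"] == example["prediction"])
    return {key: correct[key] / total[key] for key in total}


examples = [
    {"trip_start_hour": 8, "label": 1, "prediction": 1},
    {"trip_start_hour": 8, "label": 0, "prediction": 0},
    {"trip_start_hour": 23, "label": 1, "prediction": 0},
    {"trip_start_hour": 23, "label": 0, "prediction": 0},
]

overall = sliced_accuracy(examples)
by_hour = sliced_accuracy(examples, "trip_start_hour")
```

Here the overall accuracy looks acceptable, while the per-hour view shows the model is noticeably weaker on late-night trips. That gap is what sliced analysis is meant to surface.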
Hands On
04_TensorFlow_Model_Analysis
(TFMA)
2.2.4 TFX Libraries – Metadata
TensorFlow Metadata (TFMD)
ML Metadata (MLMD)
Record and Retrieve Experiment Metadata
Artifact, Execution, and Lineage Info
Track Inputs / Outputs of All TFX Components
Stores Training Run Info
Analysis and Validation Results
Model Versioning Info
2.3 TFX Components
2.3.1 ExampleGen
2.3.2 StatisticsGen
2.3.3 SchemaGen
2.3.4 ExampleValidator
2.3.5 Transform
2.3.6 Trainer
2.3.7 Evaluator
2.3.8 ModelValidator
2.3.9 Model Pusher
2.3.10 Slack (!!)
2.3.1 ExampleGen
Load Training Data Into TFX Pipeline
Supports External Data Sources
Supports CSV and TFRecord Formats
Converts Data to tf.Example
Note: TFX Pipelines require tf.Example (?!)
Difficult to use non-TF models like XGBoost
import os

from tfx.utils.dsl_utils import csv_input
from tfx.components.example_gen.csv_example_gen.component import CsvExampleGen

# base_dir is assumed to be defined earlier as the project root
examples = csv_input(os.path.join(base_dir, 'data/simple'))
example_gen = CsvExampleGen(input_base=examples)
2.3.2 StatisticsGen
Generates Statistics on Training Data
Global `mean` and `stddev` per input feature
Consumes tf.Example instances
from tfx import components

compute_eval_stats = components.StatisticsGen(
    input_data=example_gen.outputs.eval_examples,
    name='compute-eval-stats'
)
2.3.3 SchemaGen
Schema Needed by Some TFX Components
Data Types, Value Ranges, Optional, Required
Consumes Data from StatisticsGen
Schema used by TFDV, TFT, TFMA Libraries
Uses TFDV Library to infer schema
Best effort and basic
Human should verify
feature {
  name: "age"
  value_count {
    min: 1
    max: 1
  }
  type: FLOAT
  presence {
    min_fraction: 1
    min_count: 1
  }
}

from tfx import components

infer_schema = components.SchemaGen(
    stats=compute_training_stats.outputs.output)
2.3.4 ExampleValidator
Identifies Anomalies in Training Data
Used with serving data to detect drift / skew
Uses StatisticsGen and SchemaGen Outputs
Produces Validation Results
Uses TFDV Library for Input Validation
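from tfx import components

# Validate the eval statistics against the inferred schema
validate_stats = components.ExampleValidator(
    stats=compute_eval_stats.outputs.output,
    schema=infer_schema.outputs.output
)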
2.3.5 Transform
Uses Data from ExampleGen & SchemaGen
Transformations Become Part of TF Graph (!!)
Helps Avoid Training/Serving Skew
Uses TFT Library for Transformations
Transformations Require Full Pass Thru Dataset
Global Reduction Across All Batches
Create Word Embeddings, Normalize, PCA
def preprocessing_fn(inputs):
    # inputs: map from feature keys
    #         to raw not-yet-transformed features
    # outputs: map from string feature key
    #          to transformed feature operations
    ...
2.3.6 Trainer
Trains / Validates tf.Examples from Transform
Uses schema.proto from SchemaGen
Produces SavedModel and EvalSavedModel
Uses Core TensorFlow Python API
Works with TensorFlow 1.x Estimator API
TensorFlow 2.0 Keras Support Coming Soon
from tfx import components

trainer = components.Trainer(
    module_file=taxi_pipeline_utils,
    train_files=transform_training.outputs.output,
    eval_files=transform_eval.outputs.output,
    schema=infer_schema.outputs.output,
    tf_transform_dir=transform_training.outputs.output,
    train_steps=10000,
    eval_steps=5000)
2.3.7 Evaluator
Uses EvalSavedModel from Trainer
Writes Analysis Results to ML Metadata Store
Uses TFMA Library for Analysis
TFMA Uses Apache Beam to Scale Analysis
from tfx import components
import tensorflow_model_analysis as tfma

# Evaluate on the whole dataset and sliced by trip_start_hour
taxi_eval_spec = [
    tfma.SingleSliceSpec(),
    tfma.SingleSliceSpec(columns=['trip_start_hour'])
]
model_analyzer = components.Evaluator(
    examples=example_gen.outputs.eval_examples,
    eval_spec=taxi_eval_spec,
    model_exports=trainer.outputs.output)
2.3.8 ModelValidator
Validate Models from Trainer
Uses Data from SchemaGen & StatisticsGen
Compares New Models to Baseline
Baseline == current model in production
New Model is Good if Meets/Exceeds Metrics
If Good, Notify Pusher to Deploy New Model
Simulate “Next Day Evaluation” On New Data
from tfx import components
import tensorflow_model_analysis as tfma

taxi_mv_spec = [tfma.SingleSliceSpec()]
model_validator = components.ModelValidator(
    examples=example_gen.outputs.output,
    model=trainer.outputs.output)
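The "compare to baseline, then notify the pusher" rule above can be sketched in a few lines of plain Python. This is an illustration of the decision logic only, not the ModelValidator implementation: `is_blessed`, `push_if_blessed`, the `accuracy` metric, and the model URIs are all hypothetical.

```python
from typing import Dict, List, Optional


def is_blessed(candidate: Dict[str, float],
               baseline: Optional[Dict[str, float]],
               metric: str = "accuracy") -> bool:
    """Bless the candidate if it meets or exceeds the baseline metric."""
    if baseline is None:
        # Assumption of this sketch: with no model in production yet,
        # the first candidate is accepted.
        return True
    return candidate[metric] >= baseline[metric]


def push_if_blessed(model_uri: str,
                    candidate: Dict[str, float],
                    baseline: Optional[Dict[str, float]],
                    serving_dir: List[str]) -> bool:
    """Deploy (append to serving_dir) only when the model is blessed."""
    if not is_blessed(candidate, baseline):
        return False
    serving_dir.append(model_uri)
    return True


serving_dir: List[str] = []

# No production model yet, so the first candidate is pushed.
first = push_if_blessed("model/1", {"accuracy": 0.80}, None, serving_dir)

# Baseline == current model in production.
production = {"accuracy": 0.80}
worse = push_if_blessed("model/2", {"accuracy": 0.78}, production, serving_dir)
better = push_if_blessed("model/3", {"accuracy": 0.83}, production, serving_dir)
```

Only the first and third candidates reach the serving directory; the second is rejected because it falls below the production baseline.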
2.3.9 Model Pusher (Deployer)
Push Good Model to Deployment Target
Uses Trained SavedModel
Writes Version Data to Metadata Store
Write to FileSystem or TensorFlow Hub
from tfx import components

pusher = components.Pusher(
    model_export=trainer.outputs.output,
    model_blessing=model_validator.outputs.blessing,
    serving_model_dir=serving_model_dir)
2.3.10 Slack Component (!!)
Runs After ModelValidator
Adds Human-in-the-Loop Step to Pipeline
TFX Sends Message to Slack with Model URI
Asks Human to Review the New Model
Respond ‘LGTM’, ‘approve’, ‘decline’, ‘reject’
Requires Slack API Setup / Integration
# In the shell, before starting the pipeline:
#   export SLACK_BOT_TOKEN={your_token}
import os

_channel_id = 'my-channel-id'
_slack_token = os.environ['SLACK_BOT_TOKEN']
slack_validator = SlackComponent(
    model_export=trainer.outputs.output,
    model_blessing=model_validator.outputs.blessing,
    slack_token=_slack_token,
    channel_id=_channel_id,
    timeout_sec=3600)
https://github.com/tensorflow/tfx/tree/master/tfx/examples/custom_components/slack/slack_component
3.0 ML Pipelines with Airflow and KubeFlow
3.1 Airflow
3.2 KubeFlow
3.1 Airflow
Most Widely-Used Workflow Orchestrator
Define Execution Graphs in Python
Decent UI
Good Community Support
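"Define Execution Graphs in Python" boils down to declaring tasks and their upstream dependencies, then letting the orchestrator run them in a valid order. The sketch below uses only the standard library (`graphlib`, Python 3.9+), not the Airflow API. The task names and edges are an assumed reading of the TFX component descriptions in section 2.3.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of upstream tasks it depends on.
dag = {
    "example_gen": set(),
    "statistics_gen": {"example_gen"},
    "schema_gen": {"statistics_gen"},
    "example_validator": {"statistics_gen", "schema_gen"},
    "transform": {"example_gen", "schema_gen"},
    "trainer": {"transform", "schema_gen"},
    "evaluator": {"example_gen", "trainer"},
    "model_validator": {"example_gen", "trainer"},
    "pusher": {"trainer", "model_validator"},
}

# One valid execution order: every task appears after all of its upstreams.
execution_order = list(TopologicalSorter(dag).static_order())

for task in execution_order:
    print("run", task)
```

A real orchestrator adds scheduling, retries, and a UI on top, but the dependency graph is the core of the pipeline definition.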
Hands On
05_Airflow_ML_Pipelines
(Chicago Taxi Dataset)
Hands On
06_Airflow_Feature_Analysis
Hands On
07_Airflow_Model_Analysis
3.2 KubeFlow
Pipelines
Based on Argo CI/CD Project from Intuit
TFJob & PyTorch Job
Supports Distributed Training
TensorFlow & PyTorch Jobs
KubeFlow Fairing Project (!!)
Run a notebook as a production job
Deploy training code with dependencies
Hands On
08_Simple_KubeFlow_ML_Pipeline
Hands On
09_Advanced_KubeFlow_ML_Pipeline
(Chicago Taxi Dataset)
Hands On
10_Distributed_TensorFlow_Job
Hands On
10a_Distributed_PyTorch_Job
4.0 Hyper-Parameter Tuning
Experiment
Single Optimization Run
Single Objective Function Across Runs
Contains Many Trials
Trial
List of Param Values
Suggestion
Optimization Algorithm
Job
Evaluates a Trial
Calculates Objective
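The four terms above map onto a simple loop. The sketch below is plain Python with random search standing in for the suggestion algorithm. It does not use Katib, and the `learning_rate` search space and toy objective are made up for illustration.

```python
import random
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]


def suggest(space: Dict[str, Tuple[float, float]], rng: random.Random) -> Params:
    """Suggestion: the optimization algorithm (random search here)."""
    return {name: rng.uniform(low, high) for name, (low, high) in space.items()}


def run_experiment(objective: Callable[[Params], float],
                   space: Dict[str, Tuple[float, float]],
                   max_trials: int,
                   seed: int = 0) -> Tuple[Tuple[Params, float], List[Tuple[Params, float]]]:
    """Experiment: one optimization run, one objective, many trials."""
    rng = random.Random(seed)
    trials = []
    for _ in range(max_trials):
        params = suggest(space, rng)   # Trial: one list of parameter values
        score = objective(params)      # Job: evaluates the trial
        trials.append((params, score))
    best = min(trials, key=lambda trial: trial[1])  # lower objective is better
    return best, trials


def objective(params: Params) -> float:
    """Toy objective, minimized at learning_rate = 0.1 (hypothetical)."""
    return (params["learning_rate"] - 0.1) ** 2


space = {"learning_rate": (0.001, 0.5)}
best, trials = run_experiment(objective, space, max_trials=20)
```

Swapping `suggest` for grid search or Bayesian optimization changes the algorithm without touching the experiment loop.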
Hands On
11_Hyper_Parameter_Tuning
5.0 Deploy Notebook as Job
5.1 Wrap Model in a Docker Image
5.2 Deploy Job to Kubernetes
5.1 Create Docker Image
5.2 Deploy Notebook as Job
Hands On
12_Deploy_Notebook_Xgboost
Hands On
12a_Deploy_Notebook_TensorFlow
6.0 TPUs
Hands On
13_TPU_Keras_MNIST
7.0 MLflow
7.1 Experiment Tracking
7.2 Hyper-Parameter Tuning
7.3 Kubernetes-based Jobs
Hands On
14_MLflow_Scikit_Learn
Hands On
14a_MLflow_Keras
Hands On
14b_MLflow_TensorFlow
8.0 Papermill
Hands On
15_Papermill_Notebook_Job
9.0 TensorFlow Privacy (Differential Privacy)
Hands On
16_TF_Privacy
10.0 TensorFlow 2.0
11.0 Keras Tuner
Hands On
17_Keras_Tuner
12.0 Model Serving
Hands On
18_Simple_Serving_REST
Hands On
18a_AB_Test_REST
Thank you!
https://pipeline.ai @cfregly @PipelineAI
Next Workshop:
Nov 2, 2019
Next Workshop:
Nov 2, 2019

Más contenido relacionado

Más de Chris Fregly

AWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:CapAWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:CapChris Fregly
 
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...Chris Fregly
 
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Chris Fregly
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Chris Fregly
 
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Chris Fregly
 
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...Chris Fregly
 
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...Chris Fregly
 
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...Chris Fregly
 
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...Chris Fregly
 
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...Chris Fregly
 
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...Chris Fregly
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...Chris Fregly
 
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...Chris Fregly
 
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...Chris Fregly
 
Optimizing, Profiling, and Deploying TensorFlow AI Models with GPUs - San Fra...
Optimizing, Profiling, and Deploying TensorFlow AI Models with GPUs - San Fra...Optimizing, Profiling, and Deploying TensorFlow AI Models with GPUs - San Fra...
Optimizing, Profiling, and Deploying TensorFlow AI Models with GPUs - San Fra...Chris Fregly
 
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...Chris Fregly
 
Nvidia GPU Tech Conference - Optimizing, Profiling, and Deploying TensorFlow...
Nvidia GPU Tech Conference -  Optimizing, Profiling, and Deploying TensorFlow...Nvidia GPU Tech Conference -  Optimizing, Profiling, and Deploying TensorFlow...
Nvidia GPU Tech Conference - Optimizing, Profiling, and Deploying TensorFlow...Chris Fregly
 
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...Chris Fregly
 
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...Chris Fregly
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...Chris Fregly
 

Más de Chris Fregly (20)

AWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:CapAWS Re:Invent 2019 Re:Cap
AWS Re:Invent 2019 Re:Cap
 
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
 
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
 
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
 
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
 
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
 
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
 
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
 
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
 
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
 
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
 
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
 
Optimizing, Profiling, and Deploying TensorFlow AI Models with GPUs - San Fra...
Optimizing, Profiling, and Deploying TensorFlow AI Models with GPUs - San Fra...Optimizing, Profiling, and Deploying TensorFlow AI Models with GPUs - San Fra...
Optimizing, Profiling, and Deploying TensorFlow AI Models with GPUs - San Fra...
 
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
Building Google's ML Engine from Scratch on AWS with GPUs, Kubernetes, Istio,...
 
Nvidia GPU Tech Conference - Optimizing, Profiling, and Deploying TensorFlow...
Nvidia GPU Tech Conference -  Optimizing, Profiling, and Deploying TensorFlow...Nvidia GPU Tech Conference -  Optimizing, Profiling, and Deploying TensorFlow...
Nvidia GPU Tech Conference - Optimizing, Profiling, and Deploying TensorFlow...
 
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
Building Google Cloud ML Engine From Scratch on AWS with PipelineAI - ODSC Lo...
 
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
Optimizing, Profiling, and Deploying TensorFlow AI Models in Production with ...
 
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
High Performance TensorFlow in Production -- Sydney ML / AI Train Workshop @ ...
 

Último

"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Scanning the Internet for External Cloud Exposures via SSL Certs


Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTorch + XGBoost + Airflow + MLflow + Spark + Jupyter + TPU