Waking the Data Scientist @ 2am:
Detect Model Degradation on Production Models
with Amazon SageMaker Endpoints & Model Monitor
Chris Fregly
Developer Advocate @ AWS
AI and Machine Learning https://datascienceonaws.com
github.com/data-science-on-aws
@cfregly
linkedin.com/in/cfregly
Who am I?
• Former Netflix, Databricks
• Organizer, Advanced Kubeflow Meetup (Globally)
• Co-Author @ Data Science on AWS (O’Reilly 2021)
Data Science on AWS – Book Outline
https://www.datascienceonaws.com/
Amazon SageMaker
A fully managed service that covers the entire Machine Learning workflow
Amazon SageMaker
re:Invent 2019 announcements
• Amazon SageMaker Studio: First fully integrated development environment (IDE) for machine learning
• Amazon SageMaker Notebooks: Enhanced notebook experience with quick-start & easy collaboration
• Amazon SageMaker Debugger: Automatic debugging, analysis, and alerting
• Amazon SageMaker Experiments: Experiment management system to organize, track, & compare thousands of experiments
• Amazon SageMaker Model Monitor: Model monitoring to detect deviation in quality & take corrective actions
• Amazon SageMaker Autopilot: Automatic generation of ML models with full visibility & control
Amazon SageMaker
focus of this session
Amazon SageMaker Debugger
Challenges with Machine Learning Training
Debugging machine learning training is painful
Large neural networks with many layers + Many connections + Computationally intensive = Extraordinarily difficult to inspect, debug, and profile the ‘black box’
Challenges with Machine Learning Training
Manually print debug data + Manually analyze the debug data + Use open source tools for charting = Valuable data scientist/ML practitioner time wasted
Example Issues While Training ML models
• Vanishing gradients
• Exploding gradients
• Loss not decreasing across steps
• Weight update ratios are either too small or too large
• Tensor values are all zeros
Debugging them is hard, and even harder when running distributed training
All these issues affect the learning process
An example: vanishing gradients
[Figure: neural network with inputs x_1, x_2, x_3 and per-layer weights w1, w2, w3]
backpropagation
Weights update rule
$W_{\text{new}} = W - \eta \, \nabla_W L$
Intuition
Gradients vanish when they take on very small values → almost no weight update during backpropagation
Why does this happen? An example
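Sigmoid activation:
$\sigma(z) = \frac{1}{1 + e^{-z}}$
Chain rule through the network (input, hidden1, hidden2, output):
$\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial \mathit{output}} \cdot \frac{\partial \mathit{output}}{\partial \mathit{hidden2}} \cdot \frac{\partial \mathit{hidden2}}{\partial \mathit{hidden1}} \cdot \frac{\partial \mathit{hidden1}}{\partial w_1}$
The factors in this product can be small.
For a sigmoid unit:
$\frac{\partial L}{\partial z} = \frac{\partial L}{\partial \sigma} \frac{\partial \sigma}{\partial z}$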
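To make the chain-rule argument concrete, here is a minimal numeric sketch. It assumes a chain of sigmoid units and ignores the weight factors entirely (a simplification, not the network drawn on the slide); the function names are hypothetical. Since the sigmoid derivative never exceeds 0.25, multiplying one such factor per layer shrinks the gradient quickly.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^-z)."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z: float) -> float:
    """Derivative of the sigmoid, sigma(z) * (1 - sigma(z)); at most 0.25."""
    s = sigmoid(z)
    return s * (1.0 - s)


def chained_gradient(pre_activations, upstream: float = 1.0) -> float:
    """Multiply an upstream gradient by one sigmoid derivative per layer.

    Assumption: weights are left out, so this only shows the effect of the
    activation derivatives in the chain rule.
    """
    grad = upstream
    for z in pre_activations:
        grad *= sigmoid_grad(z)
    return grad


if __name__ == "__main__":
    for depth in (1, 5, 10, 20):
        # z = 0 is the best case for the sigmoid derivative (exactly 0.25).
        print(depth, chained_gradient([0.0] * depth))
```

Even in the best case for the sigmoid (z = 0), the gradient that reaches the first layer falls by a factor of four per layer.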
An Example: XGBoost, Loss Not Decreasing
• Overfitting is a problem with non-linear algorithms such as XGBoost
• By monitoring the loss over the most recent steps, training can be stopped early once the loss is no longer decreasing, or is decreasing more slowly than expected.
• In this example, training could be stopped somewhere between 20 and 40 epochs
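The "loss not decreasing" idea can be sketched in a few lines of plain Python. This is an illustrative check under stated assumptions, not the rule that SageMaker Debugger ships: the function name, the `window` size, and the `min_rel_improvement` threshold are hypothetical, and losses are assumed to be positive.

```python
def loss_not_decreasing(losses, window=10, min_rel_improvement=0.01):
    """Return True when recent steps have not improved on the earlier best loss.

    Compares the best loss in the last `window` steps with the best loss seen
    before that window. Assumes positive loss values.
    """
    if len(losses) <= window:
        return False  # not enough history to judge yet
    best_before = min(losses[:-window])
    best_recent = min(losses[-window:])
    # Flag when the recent best is not at least min_rel_improvement better.
    return best_recent > best_before * (1.0 - min_rel_improvement)


if __name__ == "__main__":
    still_improving = [1.0 / (step + 1) for step in range(20)]
    plateau = [1.0, 0.8, 0.6, 0.5] + [0.5] * 10
    print(loss_not_decreasing(still_improving))
    print(loss_not_decreasing(plateau))
```

A training loop could call this after each evaluation step and stop the job once it returns True.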
Introducing Amazon SageMaker Debugger
Training data analysis, debugging, & alert generation
• Automatic data analysis: Debug data with no code changes
• Relevant data capture: Data is automatically captured for analysis
• Automatic error detection: Errors are automatically detected and alerts are sent
• Faster training: Analyze and debug across distributed clusters
• Amazon SageMaker Studio integration: Analyze & debug from Amazon SageMaker Studio
How does Amazon SageMaker Debugger Work?
Training in
progress
Analysis in
progress
Customer’s S3 Bucket
Amazon
CloudWatch Event
Amazon SageMaker
Amazon SageMaker
Studio Visualization
Amazon SageMaker
Notebook
Action → Stop the training
Action → Analyze using Debugger SDK
Action → Visualize tensors using charts
• No code change is necessary to emit debug data with built-in algorithms and custom training scripts
• Analysis occurs in real time as data is emitted, making real-time alerts possible
Add Debugger to Training Job
Initialize your hook and
save tensors in specified
path
Initialize your rules.
These will read data for
analysis from the path
specified in the hook
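The slide's code did not survive extraction, so the following is only a sketch of the two pieces it describes: a hook that saves tensors to a path, and rules that read from that path. It builds plain dictionaries shaped like the `DebugHookConfig` and `DebugRuleConfigurations` fields of the SageMaker CreateTrainingJob request, as I understand them; verify the field names against the API reference before use. The helper names, the bucket, and the rule image URI are hypothetical placeholders, and nothing here calls AWS.

```python
def debug_hook_config(s3_output_path, collections, save_interval=100):
    """Hook settings: where to save tensors and which collections to save."""
    return {
        "S3OutputPath": s3_output_path,
        "CollectionConfigurations": [
            {
                "CollectionName": name,
                # Collection parameters are string-to-string pairs.
                "CollectionParameters": {"save_interval": str(save_interval)},
            }
            for name in collections
        ],
    }


def builtin_rule_config(rule_name, evaluator_image, instance_type="ml.t3.medium"):
    """Rule settings: which built-in rule to run against the saved tensors."""
    return {
        "RuleConfigurationName": rule_name,
        "RuleEvaluatorImage": evaluator_image,  # region-specific, placeholder here
        "InstanceType": instance_type,
        "RuleParameters": {"rule_to_invoke": rule_name},
    }


if __name__ == "__main__":
    hook = debug_hook_config("s3://example-bucket/debugger", ["losses", "gradients"])
    rules = [
        builtin_rule_config(name, "<debugger-rules-image-uri>")
        for name in ("LossNotDecreasing", "VanishingGradient")
    ]
    # These two values would be passed as DebugHookConfig and
    # DebugRuleConfigurations when creating the training job.
    print(hook)
    print(rules)
```

The rules read from the same S3 path the hook writes to, which is why the hook is configured first.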
Amazon SageMaker Model Monitor
Deploying a model is not the end. You need to continuously monitor models in production and iterate
Concept drift due to divergence of data + Model performance can change due to unknown factors + Continuous monitoring involves a lot of tooling and expense = Model monitoring is cumbersome but critical
Introducing Amazon SageMaker Model Monitor
Continuous monitoring of models in production
• Automatic data collection: Data collected from endpoints is stored in Amazon S3
• Continuous monitoring: Define a monitoring schedule and detect changes in quality against a pre-defined baseline
• CloudWatch integration: Metrics emitted to Amazon CloudWatch make it easy to alarm and automate corrective actions
• Visual data analysis: See monitoring results, data statistics, and violation reports in Amazon SageMaker Studio; analyze in notebooks
• Flexibility with rules: Use built-in rules to detect data drift or write your own rules for custom analysis
How Does Model Monitor Work?
1. Create/Update Amazon SageMaker Endpoint
[Diagram: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications]
2. Enable Data Collection for SageMaker Endpoint
[Diagram: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications; requests and predictions are captured]
Enable data capture
s3://{destination-bucket-prefix}/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl
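As a rough illustration of this step, the sketch below builds a data capture configuration and the hourly S3 prefix that follows the path template above. The dictionary mirrors the `DataCaptureConfig` field of the SageMaker CreateEndpointConfig request as I understand it; check the field names against the API reference. The helper names, bucket, and endpoint name are hypothetical, and the code makes no AWS calls.

```python
from datetime import datetime


def data_capture_config(destination_s3_uri, sampling_percentage=100):
    """Capture both requests (Input) and predictions (Output) to S3."""
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")
    return {
        "EnableCapture": True,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": destination_s3_uri,
        "CaptureOptions": [{"CaptureMode": "Input"}, {"CaptureMode": "Output"}],
    }


def capture_prefix(destination_prefix, endpoint_name, variant_name, when):
    """S3 prefix for one hour of captured data: .../yyyy/mm/dd/hh/."""
    base = destination_prefix.rstrip("/")
    return f"{base}/{endpoint_name}/{variant_name}/{when:%Y/%m/%d/%H}/"


if __name__ == "__main__":
    print(data_capture_config("s3://example-bucket/datacapture", 50))
    print(capture_prefix("s3://example-bucket/datacapture", "my-endpoint",
                         "AllTraffic", datetime(2019, 12, 1, 21)))
```

Listing objects under the returned prefix would yield the `.jsonl` files captured during that hour.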
View data collected from endpoint
sagemaker/UC-DEMO-ModelMonitor/datacapture/UC-DEMO-xgb-churn-pred-model-monitor-2019-12-01-21-09-29/AllTraffic/2019/12/01/21/28-45-917-ae917300-350f-4482-ac73-4e838d9d6115.jsonl
sagemaker/UC-DEMO-ModelMonitor/datacapture/UC-DEMO-xgb-churn-pred-model-monitor-2019-12-01-21-09-29/AllTraffic/2019/12/01/21/29-45-951-27c7035d-87f8-45f9-9993-8008abc43aaa.jsonl
Example saved prediction request & response
3. Create baseline with train/validation dataset
[Diagram: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications; requests and predictions; baseline processing job producing baseline statistics and constraints]
Create a baseline
Under the hood
1. Amazon SageMaker Model Monitor runs a ProcessingJob on your behalf
• On-demand, distributed job
• Fully managed, ideal for data processing and custom analysis
• Pay only for the duration the job runs
2. Analyzes the data collected
• SageMaker provides a pre-built container for analysis
• The pre-built container runs Deequ on Spark
• Custom analysis is also supported
Baselining results - Statistics
baselining/results/statistics.json
Baselining results – suggested constraints
baselining/results/constraints.json
4. Create a monitoring schedule
[Diagram: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications; requests and predictions; baseline processing job producing baseline statistics and constraints; scheduled monitoring job]
Schedule Monitoring Job
Under the hood
1. Amazon SageMaker Model Monitor runs a ProcessingJob on your behalf on the schedule you select (these runs are the Monitoring Jobs)
2. Analyzes the data collected using your choice of analysis container (pre-built or custom)
3. Compares results against the baseline
4. Generates results for each Monitoring Job
• Violations report for each job in Amazon S3
• Statistics report for data collected during the run
• Emits summary metrics and statistics to Amazon CloudWatch
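The compare-against-baseline step can be illustrated with a deliberately simplified sketch. This is not the Deequ-based analysis the pre-built container runs: it only checks completeness and mean shift, and the function names, thresholds, and violation fields are hypothetical.

```python
from statistics import mean, pstdev


def summarize(rows, features):
    """Per-feature completeness, mean, and standard deviation for a batch."""
    stats = {}
    for name in features:
        values = [row[name] for row in rows if row.get(name) is not None]
        stats[name] = {
            "completeness": len(values) / len(rows) if rows else 0.0,
            "mean": mean(values) if values else None,
            "std": pstdev(values) if values else None,
        }
    return stats


def find_violations(baseline, current, min_completeness=1.0, max_shift_in_std=3.0):
    """Compare current batch statistics with the baseline and list violations."""
    violations = []
    for name, base in baseline.items():
        cur = current.get(name)
        if cur is None or cur["mean"] is None or base["mean"] is None:
            violations.append({"feature": name, "check": "missing_feature"})
            continue
        if cur["completeness"] < min_completeness:
            violations.append({"feature": name, "check": "completeness"})
        diff = abs(cur["mean"] - base["mean"])
        # With zero baseline spread, any change in the mean counts as drift.
        shift = diff / base["std"] if base["std"] else (float("inf") if diff else 0.0)
        if shift > max_shift_in_std:
            violations.append({"feature": name, "check": "mean_shift"})
    return violations


if __name__ == "__main__":
    baseline = summarize([{"x": 1.0}, {"x": 2.0}, {"x": 3.0}], ["x"])
    current = summarize([{"x": 10.0}, {"x": 11.0}], ["x"])
    print(find_violations(baseline, current))
```

In the real service the baseline comes from statistics.json and constraints.json, and the violations are written to a report in Amazon S3.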
View Monitoring Jobs
5. View monitoring results
[Diagram: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications; requests and predictions; baseline processing job producing baseline statistics and constraints; scheduled monitoring job producing results (statistics and violations) and Amazon CloudWatch metrics]
6. Get alerted and take corrective actions
[Diagram: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications; requests and predictions; baseline processing job producing baseline statistics and constraints; scheduled monitoring job producing results (statistics and violations) and Amazon CloudWatch metrics; analysis of results and notifications leading to model updates, training data updates, and retraining]
Take corrective actions
1. Set alarms in Amazon CloudWatch and triggers for retraining
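A hedged sketch of the alarm step: the function below assembles keyword arguments for CloudWatch's `put_metric_alarm` call without invoking it. The parameter names are those of that API; the namespace, the `feature_baseline_drift_` metric-name prefix, and the dimension names are assumptions about how Model Monitor publishes its metrics and should be checked against your account. The endpoint, schedule, feature, and SNS topic values are hypothetical.

```python
def drift_alarm_request(endpoint_name, schedule_name, feature, threshold, sns_topic_arn):
    """Build put_metric_alarm keyword arguments for one feature's drift metric."""
    return {
        "AlarmName": f"{endpoint_name}-{feature}-drift",
        # Assumed namespace and metric naming for Model Monitor data metrics.
        "Namespace": "aws/sagemaker/Endpoints/data-metrics",
        "MetricName": f"feature_baseline_drift_{feature}",
        "Dimensions": [
            {"Name": "Endpoint", "Value": endpoint_name},
            {"Name": "MonitoringSchedule", "Value": schedule_name},
        ],
        "Statistic": "Average",
        "Period": 3600,  # seconds; matches an hourly monitoring schedule
        "EvaluationPeriods": 1,
        "Threshold": threshold,
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",
        # The SNS topic can notify people or trigger a retraining workflow.
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    request = drift_alarm_request(
        "my-endpoint", "my-schedule", "account_length", 0.1,
        "arn:aws:sns:us-east-1:123456789012:example-topic",
    )
    # With boto3 installed and credentials configured, this could be sent as:
    #   boto3.client("cloudwatch").put_metric_alarm(**request)
    print(request)
```

Pointing the alarm action at an SNS topic keeps the notification and retraining trigger decoupled from the monitoring job itself.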
SageMaker Model Monitor Summary
[Diagram: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications; requests and predictions; baseline processing job producing baseline statistics and constraints; scheduled monitoring job producing results (statistics and violations) and Amazon CloudWatch metrics; analysis of results and notifications leading to model updates, training data updates, and retraining]
References
https://github.com/aws-samples/reinvent2019-aim362-sagemaker-debugger-model-monitor/
https://aws.amazon.com/blogs/machine-learning/detecting-and-analyzing-incorrect-model-predictions-with-amazon-sagemaker-model-monitor-and-debugger/
Thank You!

15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
15 Tips to Scale a Large AI/ML Workshop - Both Online and In-Person
 
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
KubeFlow + GPU + Keras/TensorFlow 2.0 + TF Extended (TFX) + Kubernetes + PyTo...
 
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
Swift for TensorFlow - Tanmay Bakshi - Advanced Spark and TensorFlow Meetup -...
 
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
Hands-on Learning with KubeFlow + Keras/TensorFlow 2.0 + TF Extended (TFX) + ...
 
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
Spark SQL Catalyst Optimizer, Custom Expressions, UDFs - Advanced Spark and T...
 
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
PipelineAI Continuous Machine Learning and AI - Rework Deep Learning Summit -...
 
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
PipelineAI Real-Time Machine Learning - Global Artificial Intelligence Confer...
 
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
Hyper-Parameter Tuning Across the Entire AI Pipeline GPU Tech Conference San ...
 
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
PipelineAI Optimizes Your Enterprise AI Pipeline from Distributed Training to...
 
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
Advanced Spark and TensorFlow Meetup - Dec 12 2017 - Dong Meng, MapR + Kubern...
 
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
High Performance Distributed TensorFlow in Production with GPUs - NIPS 2017 -...
 
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
PipelineAI + TensorFlow AI + Spark ML + Kuberenetes + Istio + AWS SageMaker +...
 
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
PipelineAI + AWS SageMaker + Distributed TensorFlow + AI Model Training and S...
 
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
High Performance TensorFlow in Production - Big Data Spain - Madrid - Nov 15 ...
 

Último

Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfkalichargn70th171
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrandmasabamasaba
 
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...masabamasaba
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsBert Jan Schrijver
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024Mind IT Systems
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...masabamasaba
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...Shane Coughlan
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park masabamasaba
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...masabamasaba
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisamasabamasaba
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplatePresentation.STUDIO
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburgmasabamasaba
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Modelsaagamshah0812
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyviewmasabamasaba
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 

Último (20)

Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdfPayment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
Payment Gateway Testing Simplified_ A Step-by-Step Guide for Beginners.pdf
 
%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand%in Midrand+277-882-255-28 abortion pills for sale in midrand
%in Midrand+277-882-255-28 abortion pills for sale in midrand
 
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
%+27788225528 love spells in Vancouver Psychic Readings, Attraction spells,Br...
 
Generic or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisionsGeneric or specific? Making sensible software design decisions
Generic or specific? Making sensible software design decisions
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
%+27788225528 love spells in Atlanta Psychic Readings, Attraction spells,Brin...
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
OpenChain - The Ramifications of ISO/IEC 5230 and ISO/IEC 18974 for Legal Pro...
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park %in ivory park+277-882-255-28 abortion pills for sale in ivory park
%in ivory park+277-882-255-28 abortion pills for sale in ivory park
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
%+27788225528 love spells in Boston Psychic Readings, Attraction spells,Bring...
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa%in tembisa+277-882-255-28 abortion pills for sale in tembisa
%in tembisa+277-882-255-28 abortion pills for sale in tembisa
 
AI & Machine Learning Presentation Template
AI & Machine Learning Presentation TemplateAI & Machine Learning Presentation Template
AI & Machine Learning Presentation Template
 
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
%in Lydenburg+277-882-255-28 abortion pills for sale in Lydenburg
 
Unlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language ModelsUnlocking the Future of AI Agents with Large Language Models
Unlocking the Future of AI Agents with Large Language Models
 
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
%in Hazyview+277-882-255-28 abortion pills for sale in Hazyview
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 

Waking the Data Scientist at 2am: Detect Model Degradation on Production Models with Amazon SageMaker Endpoints & Model Monitor

  • 1. Waking the Data Scientist @ 2am: Detect Model Degradation on Production Models with Amazon SageMaker Endpoints & Model Monitor Chris Fregly Developer Advocate @ AWS AI and Machine Learning https://datascienceonaws.com github.com/data-science-on-aws @cfregly linkedin.com/in/cfregly
  • 2. Who am I? • Former Netflix, Databricks • Organizer Advanced Kubeflow Meetup (Globally) • Co-Author @ Data Science on AWS (O’Reilly 2021)
  • 3. Data Science on AWS – Book Outline https://www.datascienceonaws.com/
  • 4. Amazon SageMaker A fully managed service that covers the entire Machine Learning workflow
  • 5. Amazon SageMaker re:Invent 2019 announcements. Amazon SageMaker Studio: first fully integrated development environment (IDE) for machine learning. Amazon SageMaker Notebooks: enhanced notebook experience with quick-start & easy collaboration. Amazon SageMaker Debugger: automatic debugging, analysis, and alerting. Amazon SageMaker Experiments: experiment management system to organize, track, & compare thousands of experiments. Amazon SageMaker Model Monitor: model monitoring to detect deviation in quality & take corrective actions. Amazon SageMaker Autopilot: automatic generation of ML models with full visibility & control.
  • 6. Amazon SageMaker: focus of this session. Amazon SageMaker Studio: first fully integrated development environment (IDE) for machine learning. Amazon SageMaker Notebooks: enhanced notebook experience with quick-start & easy collaboration. Amazon SageMaker Debugger: automatic debugging, analysis, and alerting. Amazon SageMaker Experiments: experiment management system to organize, track, & compare thousands of experiments. Amazon SageMaker Model Monitor: model monitoring to detect deviation in quality & take corrective actions. Amazon SageMaker Autopilot: automatic generation of ML models with full visibility & control.
  • 8. Challenges with Machine Learning Training. Debugging machine learning training is painful: large neural networks with many layers + many connections + computationally intensive = extraordinarily difficult to inspect, debug, and profile the ‘black box’
  • 9. Challenges with Machine Learning Training. Debugging machine learning training is painful: manually print debug data + manually analyze the debug data + use open source tools for charting = valuable data scientist/ML practitioner time wasted
  • 10. Example Issues While Training ML Models • Vanishing gradients • Exploding gradients • Loss not decreasing across steps • Weight update ratios are either too small or too large • Tensor values are all zeros. All of these issues affect the learning process. Debugging them is hard, and even harder when running distributed training.
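  • 11. An example: vanishing gradients. [Diagram: a network with input ($x_1, x_2, x_3$), hidden1, hidden2, and output layers, with weights $w1_{i,j}$, $w2_{i,j}$, $w3_{i,j}$ between successive layers, trained by backpropagation.] Weight update rule: $W_{\text{new}} = W - \eta \, \nabla_W L$. Intuition: gradients vanish when they assume a very small value → almost no weight update during backpropagation. Why does this happen? An example with the sigmoid activation $\sigma(z) = \frac{1}{1 + e^{-z}}$: $\frac{\partial L}{\partial z} = \frac{\partial L}{\partial \sigma} \frac{\partial \sigma}{\partial z}$, and by the chain rule $\frac{\partial L}{\partial w_1} = \frac{\partial L}{\partial \mathit{output}} \cdot \frac{\partial \mathit{output}}{\partial \mathit{hidden2}} \cdot \frac{\partial \mathit{hidden2}}{\partial \mathit{hidden1}} \cdot \frac{\partial \mathit{hidden1}}{\partial w_1}$, where the factors can be small.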
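The chain-rule product on slide 11 can be made concrete with a small numeric sketch. It is an illustration only, not code from the talk: it assumes every layer uses a sigmoid activation, sets all weights to 1 so that only the activation derivatives remain, and the helper names are hypothetical.

```python
import math
from typing import Sequence


def sigmoid(z: float) -> float:
    """Logistic sigmoid, sigma(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_grad(z: float) -> float:
    """Derivative of the sigmoid; its maximum value is 0.25, reached at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)


def chained_gradient(pre_activations: Sequence[float]) -> float:
    """Multiply one sigmoid derivative per layer, as in the chain rule.

    Simplifying assumption: all weights are 1, so the product contains
    only the activation derivatives.
    """
    grad = 1.0
    for z in pre_activations:
        grad *= sigmoid_grad(z)
    return grad


if __name__ == "__main__":
    # Even in the best case (z = 0 everywhere) the product shrinks
    # geometrically with depth.
    for depth in (1, 5, 10, 20):
        print(depth, chained_gradient([0.0] * depth))
```

Because each factor is at most 0.25, the gradient that reaches the earliest layers shrinks by at least a factor of four per sigmoid layer under these assumptions.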
  • 12. An Example: XGBoost, Loss Not Decreasing • Overfitting is a problem with non-linear algorithms such as XGBoost • By monitoring the loss over a recent window of steps, training can be stopped early once the loss is no longer decreasing, or is not decreasing at the expected rate • In this example, training could be stopped somewhere between 20 and 40 epochs
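The "loss not decreasing" check on slide 12 can be sketched in plain Python. This is not the Debugger built-in rule; the function name and the `patience` and `min_delta` parameters are assumptions chosen for illustration.

```python
from typing import Optional, Sequence


def first_stalled_step(
    losses: Sequence[float],
    patience: int = 10,
    min_delta: float = 1e-3,
) -> Optional[int]:
    """Return the step at which training could be stopped, or None.

    A step counts as an improvement only if it lowers the best loss seen
    so far by at least `min_delta`. After `patience` consecutive steps
    without an improvement, that step index is returned.
    """
    best = float("inf")
    stale_steps = 0
    for step, loss in enumerate(losses):
        if best - loss >= min_delta:
            best = loss
            stale_steps = 0
        else:
            stale_steps += 1
            if stale_steps >= patience:
                return step
    return None


if __name__ == "__main__":
    history = [1.0, 0.8, 0.6, 0.5, 0.5, 0.5, 0.5]
    print(first_stalled_step(history, patience=3))
```

A larger `patience` tolerates noisy plateaus at the cost of running more steps before stopping.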
  • 13. Introducing Amazon SageMaker Debugger: training data analysis, debugging, & alert generation. Automatic data analysis: debug data with no code changes. Relevant data capture: data is automatically captured for analysis. Automatic error detection: errors are automatically detected and alerts are sent. Faster training: analyze and debug across distributed clusters. Amazon SageMaker Studio integration: analyze & debug from Amazon SageMaker Studio.
  • 14. How does Amazon SageMaker Debugger work? [Diagram components: training in progress, analysis in progress, Amazon SageMaker, customer's S3 bucket, Amazon CloudWatch Event, Amazon SageMaker Studio visualization, Amazon SageMaker Notebook.] Actions: → stop the training; → analyze using the Debugger SDK; → visualize tensors using charts. • No code change is necessary to emit debug data with built-in algorithms and custom training scripts • Analysis occurs in real time as data is emitted, making real-time alerts possible
  • 15. Add Debugger to Training Job Initialize your hook and save tensors in specified path Initialize your rules. These will read data for analysis from the path specified in the hook
  • 17. Deploying a model is not the end. You need to continuously monitor models in production and iterate: concept drift due to divergence of data + model performance can change due to unknown factors + continuous monitoring involves a lot of tooling and expense = model monitoring is cumbersome but critical
  • 18. Introducing Amazon SageMaker Model Monitor: continuous monitoring of models in production. Automatic data collection: data collected from endpoints is stored in Amazon S3. Continuous monitoring: define a monitoring schedule and detect changes in quality against a pre-defined baseline. CloudWatch integration: metrics emitted to Amazon CloudWatch make it easy to alarm and automate corrective actions. Visual data analysis: see monitoring results, data statistics, and violation reports in Amazon SageMaker Studio; analyze in notebooks. Flexibility with rules: use built-in rules to detect data drift or write your own rules for custom analysis.
  • 19. How Does Model Monitor Work?
  • 20. 1. Create/Update Amazon SageMaker Endpoint. [Diagram components: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications]
  • 21. 2. Enable Data Collection for SageMaker Endpoint. [Diagram components: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications, requests and predictions]
  • 23. View data collected from endpoint. Path format: s3://{destination-bucket-prefix}/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl. Examples: sagemaker/UC-DEMO-ModelMonitor/datacapture/UC-DEMO-xgb-churn-pred-model-monitor-2019-12-01-21-09-29/AllTraffic/2019/12/01/21/28-45-917-ae917300-350f-4482-ac73-4e838d9d6115.jsonl and sagemaker/UC-DEMO-ModelMonitor/datacapture/UC-DEMO-xgb-churn-pred-model-monitor-2019-12-01-21-09-29/AllTraffic/2019/12/01/21/29-45-951-27c7035d-87f8-45f9-9993-8008abc43aaa.jsonl
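The key layout on slide 23 can be split back into its parts with a few lines of Python. This is a minimal sketch that assumes keys follow the {prefix}/{endpoint-name}/{variant-name}/yyyy/mm/dd/hh/filename.jsonl pattern exactly; `CaptureKey` and `parse_capture_key` are hypothetical helper names, not part of any AWS SDK.

```python
from typing import NamedTuple


class CaptureKey(NamedTuple):
    prefix: str
    endpoint_name: str
    variant_name: str
    year: int
    month: int
    day: int
    hour: int
    filename: str


def parse_capture_key(key: str) -> CaptureKey:
    """Split a data-capture object key into its components.

    The last seven path segments are read from the right; everything
    before them is treated as the destination prefix.
    """
    parts = key.strip("/").split("/")
    if len(parts) < 7:
        raise ValueError(f"not a data-capture key: {key!r}")
    *prefix, endpoint, variant, yyyy, mm, dd, hh, filename = parts
    return CaptureKey(
        "/".join(prefix), endpoint, variant,
        int(yyyy), int(mm), int(dd), int(hh), filename,
    )


if __name__ == "__main__":
    # First example key from the slide.
    key = (
        "sagemaker/UC-DEMO-ModelMonitor/datacapture/"
        "UC-DEMO-xgb-churn-pred-model-monitor-2019-12-01-21-09-29/"
        "AllTraffic/2019/12/01/21/"
        "28-45-917-ae917300-350f-4482-ac73-4e838d9d6115.jsonl"
    )
    print(parse_capture_key(key))
```

Reading the segments from the right keeps the sketch independent of how many path segments the destination prefix contains.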
  • 24. Example saved prediction request & response
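The record shown on slide 24 is not preserved in this transcript, so the sketch below assumes a layout: one JSON object per line, with the request under `captureData.endpointInput.data` and the response under `captureData.endpointOutput.data`, both CSV-encoded. The field names are an assumption about the capture format and the sample values are made up for illustration.

```python
import json
from typing import List, Tuple

# Hypothetical captured record; values are illustrative, not from the talk.
SAMPLE_CAPTURE_LINE = json.dumps({
    "captureData": {
        "endpointInput": {
            "observedContentType": "text/csv",
            "mode": "INPUT",
            "data": "186,0.1,137.8",
            "encoding": "CSV",
        },
        "endpointOutput": {
            "observedContentType": "text/csv; charset=utf-8",
            "mode": "OUTPUT",
            "data": "0.014",
            "encoding": "CSV",
        },
    },
    "eventMetadata": {"inferenceTime": "2019-12-01T21:28:45Z"},
    "eventVersion": "0",
})


def extract_request_and_response(line: str) -> Tuple[List[float], float]:
    """Return (input features, prediction) from one JSON Lines record."""
    capture = json.loads(line)["captureData"]
    features = [float(v) for v in capture["endpointInput"]["data"].split(",")]
    prediction = float(capture["endpointOutput"]["data"].split(",")[0])
    return features, prediction


if __name__ == "__main__":
    print(extract_request_and_response(SAMPLE_CAPTURE_LINE))
```

Pairing each request with its prediction in this way is what lets later analysis compare live traffic against the training baseline.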
  • 25. 3. Create baseline with train / validation dataset. [Diagram components: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications, requests and predictions, baseline processing job, baseline statistics and constraints]
  • 27. Under the hood. 1. Amazon Model Monitor runs a ProcessingJob on your behalf • On-demand, distributed job • Fully managed; ideal for data processing and custom analysis • Pay only for the duration the job runs. 2. It analyzes the data collected • SageMaker provides a pre-built container for analysis • The pre-built container runs Deequ on Spark • Custom analysis is also supported
  • 28. Baselining results - Statistics baselining/results/statistics.json
  • 29. Baselining results – suggested constraints baselining/results/constraints.json
  • 30. 4. Create a monitoring schedule. [Diagram components: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications, requests and predictions, baseline processing job, baseline statistics and constraints, scheduled monitoring job]
  • 32. Under the hood. 1. Amazon Model Monitor runs a ProcessingJob on your behalf at the schedule you select (i.e., Monitoring Jobs). 2. It analyzes the data collected using your choice of analysis container (pre-built or custom), compares the results against the baseline, and generates results for each Monitoring Job • Violations report for each job in Amazon S3 • Statistics report for data collected during the run • Summary metrics and statistics emitted to Amazon CloudWatch
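The "compare against the baseline" step on slide 32 can be sketched with a toy check. This is not the analysis the pre-built container performs: the dictionary layout, the violation labels, the feature names, and the relative-mean drift test are all assumptions made for illustration.

```python
from typing import Dict, List

FeatureStats = Dict[str, float]


def find_violations(
    baseline: Dict[str, FeatureStats],
    current: Dict[str, FeatureStats],
    drift_threshold: float = 0.1,
) -> List[str]:
    """Compare per-feature statistics from one run against a baseline.

    Flags a feature when it is missing, when its completeness (share of
    non-null values) drops below the baseline, or when its mean moves by
    more than `drift_threshold` relative to the baseline mean.
    """
    violations: List[str] = []
    for name, base in baseline.items():
        cur = current.get(name)
        if cur is None:
            violations.append(f"{name}: missing_column")
            continue
        if cur["completeness"] < base["completeness"]:
            violations.append(f"{name}: completeness_check")
        scale = abs(base["mean"]) or 1.0  # avoid dividing by zero
        if abs(cur["mean"] - base["mean"]) / scale > drift_threshold:
            violations.append(f"{name}: mean_drift")
    return violations


if __name__ == "__main__":
    # Made-up feature names and numbers.
    baseline = {
        "day_mins": {"completeness": 1.0, "mean": 180.0},
        "eve_calls": {"completeness": 1.0, "mean": 100.0},
    }
    current = {
        "day_mins": {"completeness": 0.95, "mean": 181.0},
        "eve_calls": {"completeness": 1.0, "mean": 130.0},
    }
    print(find_violations(baseline, current))
```

In a scheduled setup, a non-empty result for a run is the kind of signal that would be written to a violations report and surfaced as a metric for alarming.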
  • 34. 5. View monitoring results. [Diagram components: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications, requests and predictions, baseline processing job, baseline statistics and constraints, scheduled monitoring job, results (statistics and violations), Amazon CloudWatch metrics]
  • 35. 6. Get alerted and take corrective actions. [Diagram components: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications, requests and predictions, baseline processing job, baseline statistics and constraints, scheduled monitoring job, results (statistics and violations), Amazon CloudWatch metrics, analysis of results, notifications] • Model updates • Training data updates • Retraining
  • 36. Take corrective actions 1. Set alarms in Amazon CloudWatch and triggers for retraining
  • 37. SageMaker Model Monitor Summary. [Diagram components: Amazon SageMaker training job, model, Amazon SageMaker endpoint, applications, requests and predictions, baseline processing job, baseline statistics and constraints, scheduled monitoring job, results (statistics and violations), Amazon CloudWatch metrics, analysis of results, notifications] • Model updates • Training data updates • Retraining
  • 39. Thank You! Waking the Data Scientist @ 2am: Detect Model Degradation on Production Models with Amazon SageMaker Endpoints & Model Monitor Chris Fregly Developer Advocate @ AWS AI and Machine Learning https://datascienceonaws.com github.com/data-science-on-aws @cfregly linkedin.com/in/cfregly