LESSONS
LEARNED
from
building practical
Deep Learning systems
Xavier Amatriain (@xamat)
A bit about myself...
● PhD on Audio and Music Signal Processing and Modeling
● Researcher in Recommender Systems for several years
● Led ML Research/Engineering at Netflix
● VP of Engineering at Quora
● Currently co-founder/CTO at Curai (Providing the world’s best healthcare to
everyone)
A bit about Curai...
What are we doing?
● Mission: Provide the world's
best healthcare for everyone
● Product: User-facing mobile
primary care app
● Team: Building an awesome
and diverse team
● Approach: State-of-the-art
AI/ML + product/UX/clinical
AI-based interaction
AI + Health coaches
AI + Doctors
Peer-reviewed research at Curai
Lessons learned...
More Data
or
Better Data?
Lesson 1
More data or better models?
Really?
More data or better models?
Sometimes,
it’s not about
more data
More data or better models?
Norvig:
“Google does not have
better Algorithms only
more Data”
Many
features/
low-bias
models
More data or better models?
Sometimes
you might not
need all your
“Big Data”
[Figure: testing accuracy vs. number of training examples (in millions, 0 to 20)]
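A learning curve is one way to check whether more data would still help. The sketch below is a toy illustration, not the experiment behind the slide: the two-blob dataset, the nearest-centroid classifier, and all function names are assumptions chosen so the example runs with the standard library only.

```python
import random

def make_data(n_per_class, rng):
    """Two Gaussian blobs in 2-D: class 0 around (-1, -1), class 1 around (1, 1)."""
    data = []
    for label, center in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
            data.append((point, label))
    return data

def fit_centroids(train):
    """Nearest-centroid 'model': one mean vector per class."""
    sums = {0: [0.0, 0.0, 0], 1: [0.0, 0.0, 0]}
    for (x, y), label in train:
        sums[label][0] += x
        sums[label][1] += y
        sums[label][2] += 1
    return {c: (s[0] / s[2], s[1] / s[2]) for c, s in sums.items()}

def accuracy(centroids, test):
    """Fraction of test points whose closest centroid has the right label."""
    correct = 0
    for (x, y), label in test:
        pred = min(
            centroids,
            key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
        )
        correct += pred == label
    return correct / len(test)

def learning_curve(sizes_per_class, seed=0):
    """Train on growing samples and score each model on one fixed test set."""
    rng = random.Random(seed)
    test = make_data(1000, rng)
    curve = []
    for n in sizes_per_class:
        train = make_data(n, rng)
        curve.append((2 * n, accuracy(fit_centroids(train), test)))
    return curve

curve = learning_curve([5, 50, 500, 5000])
for n_examples, acc in curve:
    print(f"{n_examples:>6} training examples -> test accuracy {acc:.3f}")
```

If the curve flattens well before the full dataset is used, sampling the data may cost little accuracy; for a low-bias model on a harder problem the curve may keep climbing.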
What about Deep Learning?
Year | Breakthrough in AI | Dataset (first available) | Algorithm (first proposed)
1994 | Human-level spontaneous speech recognition | Spoken Wall Street Journal articles and other texts (1991) | Hidden Markov Model (1984)
1997 | IBM Deep Blue defeated Garry Kasparov | 700,000 Grandmaster chess games, aka “The Extended Book” (1991) | NegaScout planning algorithm (1983)
2005 | Google’s Arabic- and Chinese-to-English translation | 1.8 trillion tokens from Google Web and News pages (collected in 2005) | Statistical machine translation algorithm (1988)
2011 | IBM Watson became the world Jeopardy! champion | 8.6 million documents from Wikipedia, Wiktionary, Wikiquote, and Project Gutenberg (updated in 2005) | Mixture-of-Experts algorithm (1991)
2014 | Google’s GoogLeNet object classification at near-human performance | ImageNet corpus of 1.5 million labeled images and 1,000 object categories (2010) | Convolutional neural network algorithm (1989)
2015 | Google’s DeepMind achieved human parity in playing 29 Atari games by learning general control from video | Arcade Learning Environment dataset of over 50 Atari games (2013) | Q-learning algorithm (1992)
Average number of years to breakthrough | | 3 years | 18 years
The average elapsed time between key algorithm proposals and corresponding advances was about 18 years,
whereas the average elapsed time between key dataset availabilities and corresponding advances was less
than 3 years, or about 6 times faster.
What about Deep Learning?
Models and
Recipes
Pretrained
Available models trained using OpenNMT
→ English → German
→ German → English
→ English Summarization
→ Multi-way – FR,ES,PT,IT,RO < > FR,ES,PT,IT,RO
More models coming soon:
→ Ubuntu Dialog Dataset
→ Syntactic Parsing
→ Image-to-Text
More data and better data
Simple
Models
>>
Complex
Models
Lesson 2
Football or Futbol?
Occam’s razor
Given two models that perform
more or less equally, you should
always prefer the less complex
Deep Learning might not be
preferred, even if it squeezes a
+1% in accuracy
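The rule above can be sketched as a small selection helper. Everything here is illustrative: the `Candidate` type, the parameter count as a stand-in for complexity, the 1% tolerance, and the accuracy numbers are assumptions, not figures from the talk.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    accuracy: float      # validation accuracy in [0, 1] (made-up values below)
    n_parameters: int    # crude, assumed proxy for model complexity

def pick_model(candidates, tolerance=0.01):
    """Among models within `tolerance` of the best accuracy, return the simplest."""
    best = max(c.accuracy for c in candidates)
    good_enough = [c for c in candidates if best - c.accuracy <= tolerance]
    return min(good_enough, key=lambda c: c.n_parameters)

candidates = [
    Candidate("logistic regression", 0.912, 41),
    Candidate("gradient-boosted trees", 0.918, 5_000),
    Candidate("deep network", 0.921, 2_000_000),
]

chosen = pick_model(candidates)
print(chosen.name)
```

In practice "complexity" would also cover training time, serving cost, and maintenance, which a single number only approximates.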
Reasons to prefer a simpler model
….
There are many others
System complexity
Maintenance
Explainability
….
Figure 3: GoogLeNet network with all the bells and whistles
A real-life example
Goal: Supervised
Classification
→ 40 features
→ 10k examples
What did the ML
Engineer choose?
→ Multi-layer ANN trained
with TensorFlow
What was his proposed
next step?
→ Try ConvNets
Where is the problem?
→ Hours to train, already
looking into distributing
→ There are much simpler
approaches
.... But,
sometimes
you do need
a Complex
Model
Lesson 3
Better models and features that “don’t work”
E.g. You have a linear model and have been
selecting and optimizing features for that model
→ More complex model with the same features -> improvement
not likely
→ More expressive features -> improvement not likely
More complex features may require a more
complex model
A more complex model may not show
improvements with a feature set that is too
simple
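A toy case shows how features and model capacity depend on each other. In this sketch (an assumed XOR-style example, not one from the talk) a linear perceptron cannot fit the raw inputs, but the same model fits perfectly once an interaction feature is added. A more expressive model could learn that interaction itself, which is the other side of the same trade-off.

```python
def predict(weights, x):
    """Linear decision rule; the last weight acts as the bias term."""
    xb = list(x) + [1.0]
    score = sum(w * v for w, v in zip(weights, xb))
    return 1 if score > 0 else -1

def train_perceptron(rows, labels, epochs=20):
    """Classic perceptron updates: nudge the weights on every mistake."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            if predict(weights, x) != y:
                xb = list(x) + [1.0]
                weights = [w + y * v for w, v in zip(weights, xb)]
    return weights

def accuracy(weights, rows, labels):
    hits = sum(predict(weights, x) == y for x, y in zip(rows, labels))
    return hits / len(rows)

# Target is +1 when both inputs agree: not linearly separable in (a, b).
raw = [(-1.0, -1.0), (-1.0, 1.0), (1.0, -1.0), (1.0, 1.0)]
labels = [1, -1, -1, 1]

# Same model family, plus one hand-made interaction feature a * b.
with_interaction = [(a, b, a * b) for a, b in raw]

acc_raw = accuracy(train_perceptron(raw, labels), raw, labels)
acc_interaction = accuracy(
    train_perceptron(with_interaction, labels), with_interaction, labels
)
print(acc_raw, acc_interaction)
```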
Yes, you
should care
about
Feature
Engineering
Lesson 4
Feature Engineering Example - Answer Ranking
How are those dimensions
translated into features?
Features that relate to the answer
quality itself
Interaction features (upvotes/downvotes,
clicks, comments…)
User features (e.g. expertise in topic)
What is a good Quora answer?
Truthful, reusable, provides explanation, well formatted, ...
Feature Engineering
Properties of a well-behaved ML feature
Interpretable
Reliable
Reusable
Transformable
Deep Learning: Automating Feature Discovery
[Figure (I. Goodfellow): rule-based systems (input → hand-designed program → output); classic machine learning (input → hand-designed features → mapping from features → output); representation learning (input → features → mapping from features → output); deep learning (input → simplest features → most complex features → mapping from features → output)]
Deep Learning & Feature Engineering
Deep Learning & Architecture Engineering
Supervised
vs.
Unsupervised
Learning
Lesson 5
Supervised/Unsupervised Learning
Unsupervised learning as
dimensionality reduction
E.g.1
Clustering + knn
E.g.2
Matrix Factorization
MF can be
interpreted as
Unsupervised:
• Dimensionality Reduction a la PCA
• Clustering (e.g. NMF)
Supervised:
• Labeled targets ~ regression
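Both readings of matrix factorization show up in a few lines of code. The sketch below fits low-rank user and item factors by stochastic gradient descent on observed ratings: the fitting step is a regression on labeled targets, while the learned factors are a low-dimensional representation. The toy ratings, hyperparameters, and function names are assumptions for illustration.

```python
import random

def factorize(ratings, n_users, n_items, rank=2, lr=0.02, reg=0.01, epochs=500, seed=0):
    """Fit user factors P and item factors Q on (user, item, rating) triples."""
    rng = random.Random(seed)
    P = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            # Supervised view: the observed rating is the regression target.
            err = r - sum(pu * qi for pu, qi in zip(P[u], Q[i]))
            for f in range(rank):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def rmse(ratings, P, Q):
    """Root mean squared error over the observed entries only."""
    squared = sum(
        (r - sum(a * b for a, b in zip(P[u], Q[i]))) ** 2 for u, i, r in ratings
    )
    return (squared / len(ratings)) ** 0.5

# Hypothetical observed entries of a 4 x 4 user-item rating matrix.
ratings = [
    (0, 0, 5.0), (0, 1, 4.0), (0, 3, 1.0),
    (1, 0, 4.0), (1, 2, 1.0), (1, 3, 1.0),
    (2, 1, 1.0), (2, 2, 5.0), (2, 3, 4.0),
    (3, 0, 1.0), (3, 2, 4.0), (3, 3, 5.0),
]

P0, Q0 = factorize(ratings, 4, 4, epochs=0)   # same seed, so same starting point
P, Q = factorize(ratings, 4, 4, epochs=500)
rmse_before, rmse_after = rmse(ratings, P0, Q0), rmse(ratings, P, Q)
print(f"RMSE before training: {rmse_before:.3f}, after: {rmse_after:.3f}")
```

Each row of `P` is a 2-dimensional embedding of a user; those embeddings can then be fed to clustering or kNN, which is the unsupervised reading.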
Unsupervised learning
as feature engineering
The “magic” behind combining unsupervised/supervised learning
Supervised/Unsupervised Learning
One of the “tricks” in
Deep Learning is how it
combines unsupervised
/supervised learning
→ E.g. Stacked Autoencoders
→ E.g. training of convolutional nets
[Figure: stacked autoencoders: inputs X1 to X6 → Features I → Features II → softmax classifier producing P(y=0 | x), P(y=1 | x), P(y=2 | x)]
Convolutional Network (ConvNet)
[Figure: Input 83x83 → 9x9 convolution (64 kernels) → Layer 1 64@75x75 → 10x10 pooling, 5x5 subsampling → Layer 2 64@14x14 → 9x9 convolution (4096 kernels) → Layer 3 256@6x6 → 6x6 pooling, 4x4 subsampling → Layer 4 256@1x1 → Output 101]
→ Non-Linearity: half-wave rectification, shrinkage function, sigmoid
→ Pooling: average, L1, L2, max
→ Training: Supervised (1988-2006), Unsupervised+supervised (2006-now)
Neural Networks
[Figure: landscape of supervised and unsupervised methods, including Boosting, SVM, Decision Tree, Perceptron, AE, D-AE, Neural Net, RNN, Conv. Net, RBM, Sparse Coding, DBN, DBM, GMM, Bayes NP, ΣΠ]
Self-supervised Learning
Self-supervision
→ E.g. BERT and other LM
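Self-supervision builds the labels out of the data itself. A minimal sketch of the masked-token idea follows; it is a simplification, since BERT's actual procedure also sometimes keeps the token or swaps in a random one, and the function name and the 0.5 masking rate in the demo are assumptions.

```python
import random

MASK = "[MASK]"

def make_masked_example(tokens, mask_prob=0.15, seed=0):
    """Hide some tokens; the hidden tokens become the prediction targets."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            targets.append(token)   # label comes from the text itself
        else:
            inputs.append(token)
            targets.append(None)    # nothing to predict at this position
    return inputs, targets

tokens = "more data or better models is the wrong question".split()
inputs, targets = make_masked_example(tokens, mask_prob=0.5, seed=1)
print(inputs)
print(targets)
```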
Everything is
an ensemble
Lesson 6
Ensembles
Netflix Prize was won by an ensemble
Most practical applications of ML run
an ensemble
→ Initially BellKor was using GBDTs
→ BigChaos introduced ANN-based ensemble
→ Why wouldn’t you?
→ At least as good as the best of your methods
→ Can add completely different approaches (e.g. CF
and content-based)
→ You can use many different models at the ensemble
layer: LR, GBDTs, RFs, ANNs...
Ensembles & Feature Engineering
Ensembles are
the way to turn
any model into a
feature!
E.g. Don’t know if the
way to go is to use
Factorization Machines,
Tensor Factorization, or
RNNs?
→ Treat each model as a
“feature”
→ Feed them into an
ensemble
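Treating each model as a feature can be sketched with the simplest possible ensemble layer: a least-squares linear blend of two base models' predictions. The prediction lists below are made-up numbers, and the closed-form 2x2 solve is an assumption chosen to keep the example dependency-free; a real ensemble layer could be LR, trees, or a neural network.

```python
def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)

def fit_blend(p1, p2, y):
    """Least-squares weights (w1, w2) for the blend w1 * p1 + w2 * p2."""
    a11 = sum(a * a for a in p1)
    a12 = sum(a * b for a, b in zip(p1, p2))
    a22 = sum(b * b for b in p2)
    b1 = sum(a * t for a, t in zip(p1, y))
    b2 = sum(b * t for b, t in zip(p2, y))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("base predictions are collinear; blend is not unique")
    # Cramer's rule on the 2x2 normal equations.
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

# Hypothetical targets and outputs of two different base models.
y = [3.0, 1.0, 4.0, 2.0, 5.0]
model_a = [2.5, 1.5, 3.5, 2.5, 4.0]   # e.g. a collaborative-filtering model
model_b = [3.5, 0.5, 4.5, 1.0, 5.5]   # e.g. a content-based model

w1, w2 = fit_blend(model_a, model_b, y)
blend = [w1 * a + w2 * b for a, b in zip(model_a, model_b)]
mse_a, mse_b, mse_blend = mse(model_a, y), mse(model_b, y), mse(blend, y)
print(f"A: {mse_a:.3f}  B: {mse_b:.3f}  blend: {mse_blend:.3f}")
```

On the data used to fit the weights, the blend can never be worse than either input, because each single model is a special case of the blend. On held-out data that guarantee does not hold, so the blend weights should be fit on data the base models were not trained on.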
[Figure: Wide models, Deep models, and Wide & Deep models: sparse features, dense embeddings, hidden layers (rectified linear units), output units (sigmoid)]
There are
biases in your
data
Lesson 7
Defining training/testing data
Training a simple binary classifier for
good/bad answer
→ Defining positive and negative labels ->
Non-trivial task
→ Is this a positive or a negative?
→ funny uninformative answer with many
upvotes
→ short uninformative answer by a
well-known expert in the field
→ very long informative answer that nobody
reads/upvotes
→ informative answer with
grammar/spelling mistakes
→ ...
The curse of presentation bias
Better options
→ Correcting for the probability
a user will click on a position
-> Attention models
→ Explore/exploit approaches
such as MAB
Simply treating things you
show as negatives is not likely
to work
User can only click on what
you decide to show
→ But, what you decide to
show is the result of what
your model predicted is good
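One common way to correct for position is inverse propensity weighting: each click is divided by the probability that the user looked at that position at all. The sketch below assumes those examination probabilities are already known; the values, the log format, and the function name are hypothetical.

```python
from collections import defaultdict

def ipw_click_rates(log, propensity):
    """Position-corrected click rate per item from (item, position, clicked) rows."""
    weighted_clicks = defaultdict(float)
    impressions = defaultdict(int)
    for item, position, clicked in log:
        impressions[item] += 1
        if clicked:
            # A click in a rarely examined slot counts for more.
            weighted_clicks[item] += 1.0 / propensity[position]
    return {item: weighted_clicks[item] / impressions[item] for item in impressions}

# Assumed probability that a user examines each position.
propensity = {0: 1.0, 1: 0.5, 2: 0.25}

# Item A is always shown on top, item B always in the third slot.
log = [("A", 0, True)] * 5 + [("A", 0, False)] * 5
log += [("B", 2, True)] * 1 + [("B", 2, False)] * 7

rates = ipw_click_rates(log, propensity)
naive_b = 1 / 8   # raw click-through rate of item B
print(rates, naive_b)
```

The raw click-through rate makes item B look four times worse than item A, while the corrected estimates are equal. In practice the propensities have to be estimated, for example from randomized placements.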
[Figure: regions of the page a user is more likely vs. less likely to see]
Bias & Fairness
Think about your
models
“in the wild”
Lesson 8
AI in the wild: Desired properties
● Easily extensible
○ Incrementally/iteratively learn from
“human-in-the-loop” or from
additional data
● Knows what it does not know
○ Models uncertainty in prediction
○ Enables fall-back to manual
Assisted diagnosis in the wild
1. Extensibility
a. Diagnosis as a ML task
i. Expert systems as a prior
b. Modeling less prevalent diseases
i. Low-shot learning
2. Knowing what you don’t know
b. Measures of uncertainty in
prediction
c. Allows fall-back to
“physician-in-the-loop”
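The fall-back idea can be sketched as a thresholded decision: act on the prediction only when the model is confident enough, otherwise hand the case to a person. This is a minimal sketch under strong assumptions: the top class probability is only a crude uncertainty proxy unless the model is calibrated, and the threshold, labels, and function name are hypothetical.

```python
def predict_or_defer(class_probs, threshold=0.8):
    """Return (label, confidence), or defer when confidence is below threshold."""
    label, confidence = max(class_probs.items(), key=lambda kv: kv[1])
    if confidence < threshold:
        return "defer_to_physician", confidence
    return label, confidence

confident_case = predict_or_defer({"condition_a": 0.90, "condition_b": 0.10})
uncertain_case = predict_or_defer(
    {"condition_a": 0.50, "condition_b": 0.45, "condition_c": 0.05}
)
print(confident_case, uncertain_case)
```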
Data and Models are great.
You know what’s even better?
The right
evaluation
approach!
Lesson 9
Offline/Online testing process
[Figure: flowchart with two stages. Offline experimentation: initial hypothesis → choose model → train model → test offline → hypothesis validated? If no: try a different model, or reformulate the hypothesis. If yes, move to online experimentation: design A/B test → choose control → deploy prototype → observe behavior → analyze results → significant improvements? If yes: deploy feature. If no: reformulate the hypothesis.]
Executing A/B tests
Overall Evaluation Criteria (OEC) =
e.g. member retention at Netflix
→ Use long-term metrics
whenever possible
→ Short-term metrics can be
informative and allow faster
decisions
− But, not always aligned with
OEC
Measure differences
in metrics across
statistically identical
populations that
each experience a
different algorithm.
Decisions on the product always
data-driven
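Measuring a difference between two populations usually ends in a significance test. The sketch below uses a two-proportion z-test on a binary metric, which is one standard choice and not necessarily what any particular company runs; the retention counts are made up.

```python
import math

def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Return (z, two-sided p-value) for the difference in rates B minus A."""
    rate_a, rate_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Standard normal CDF written with the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)

# Hypothetical cells: members retained out of members exposed.
z, p_value = two_proportion_z_test(200, 1000, 250, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value only says the difference is unlikely under "no effect"; whether the metric tracks the OEC is a separate question.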
Offline testing
Measure model
performance, using (IR)
metrics
Offline performance =
indication to make decisions
on follow-up A/B tests
A critical (and mostly
unsolved) issue is how
offline metrics correlate with
A/B test results.
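As an example of an IR metric used offline, here is a sketch of NDCG@k with the common exponential gain. The choice of NDCG and the graded relevance labels are assumptions; the slide does not name a specific metric.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of a ranked list of graded relevances."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)
        for rank, rel in enumerate(relevances[:k])
    )

def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0

perfect = ndcg_at_k([3, 2, 1, 0], k=4)    # already in ideal order
swapped = ndcg_at_k([0, 1], k=2)          # relevant item ranked second
print(perfect, swapped)
```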
Do not
underestimate
the value of
systems and
frameworks
Lesson 10
ML vs Software
Can you treat your ML infrastructure as you
would your software one?
→ Yes and No
You should apply best Software Engineering
practices (e.g. encapsulation, abstraction,
cohesion, low coupling…)
However, Design Patterns for Machine Learning
software are not well known/documented
Software: the new frontier of ML?
Your AI
infrastructure
will have two
masters
Lesson 11
Machine Learning Infrastructure
→ Whenever you develop any ML infrastructure, you need to target two different modes:
Mode 1: ML experimentation
− Flexibility
− Easy-to-use
− Reusability
Mode 2: ML production
− All of the above + performance & scalability
→ Ideally you want the two modes to be as similar as possible
→ How to combine them?
Machine Learning Infrastructure
Option 1
→ Favor experimentation and only invest in productionizing once something shows results
→ E.g. Have ML researchers use R and then ask Engineers to implement things in production when they work
Option 2
→ Favor production and have “researchers” struggle to figure out how to run experiments
→ E.g. Implement highly optimized C++ code and have ML researchers experiment only through data available in logs/DB
Machine Learning Infrastructure
Good
intermediate
options
→ Have ML “researchers” experiment on Jupyter Notebooks using
Python tools (scikit-learn, PyTorch, TF…). Use same tools in
production whenever possible, implement optimized versions only
when needed.
→ Implement abstraction layers on top of optimized implementations
so they can be accessed from regular/friendly experimentation tools
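The abstraction-layer option can be sketched as one interface with interchangeable backends. All names here are hypothetical, and `OptimizedLinearRanker` is only a stand-in for a wrapper around a real optimized implementation (for example a C++ library); the point is that notebook code and production code call the same `rank` function.

```python
from typing import Dict, List, Protocol, Sequence

class Ranker(Protocol):
    """The only surface that experimentation and production code depend on."""
    def score(self, features: Sequence[float]) -> float: ...

class ReferenceLinearRanker:
    """Readable implementation, convenient for notebooks."""
    def __init__(self, weights: Sequence[float]) -> None:
        self.weights = list(weights)

    def score(self, features: Sequence[float]) -> float:
        return sum(w * x for w, x in zip(self.weights, features))

class OptimizedLinearRanker:
    """Stand-in for a binding to an optimized backend (assumed, not real)."""
    def __init__(self, weights: Sequence[float]) -> None:
        self._weights = tuple(weights)

    def score(self, features: Sequence[float]) -> float:
        total = 0.0
        for w, x in zip(self._weights, features):
            total += w * x
        return total

def rank(ranker: Ranker, items: Dict[str, Sequence[float]]) -> List[str]:
    """Order item names by descending score, whatever the backend is."""
    return sorted(items, key=lambda name: ranker.score(items[name]), reverse=True)

items = {"answer_1": [0.2, 1.0], "answer_2": [0.9, 0.1], "answer_3": [0.5, 0.5]}
weights = [2.0, 1.0]
notebook_order = rank(ReferenceLinearRanker(weights), items)
production_order = rank(OptimizedLinearRanker(weights), items)
print(notebook_order, production_order)
```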
There is ML
beyond Deep
Learning
Lesson 12
Other ML Advances
● Factorization Machines
● Tensor Methods
● Non-parametric Bayesian models
● XGBoost
● Online Learning
● Reinforcement Learning
● Learning to rank
● ...
Other very successful approaches
Sometimes DL does not win
Conclusions
01. Choose the right metric
02. Be thoughtful about your data
03. Understand dependencies between data, models & systems
04. Optimize only what matters, beware of biases
05. Be thoughtful about your ML infrastructure/tools and about organizing your teams

 
Medical advice as a Recommender System
Medical advice as a Recommender SystemMedical advice as a Recommender System
Medical advice as a Recommender System
 
Past present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry PerspectivePast present and future of Recommender Systems: an Industry Perspective
Past present and future of Recommender Systems: an Industry Perspective
 
Machine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora ExampleMachine Learning for Q&A Sites: The Quora Example
Machine Learning for Q&A Sites: The Quora Example
 
Past, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspectivePast, present, and future of Recommender Systems: an industry perspective
Past, present, and future of Recommender Systems: an industry perspective
 
Barcelona ML Meetup - Lessons Learned
Barcelona ML Meetup - Lessons LearnedBarcelona ML Meetup - Lessons Learned
Barcelona ML Meetup - Lessons Learned
 
10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf10 more lessons learned from building Machine Learning systems - MLConf
10 more lessons learned from building Machine Learning systems - MLConf
 
10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems10 more lessons learned from building Machine Learning systems
10 more lessons learned from building Machine Learning systems
 
Machine Learning to Grow the World's Knowledge
Machine Learning to Grow  the World's KnowledgeMachine Learning to Grow  the World's Knowledge
Machine Learning to Grow the World's Knowledge
 
MLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@QuoraMLConf Seattle 2015 - ML@Quora
MLConf Seattle 2015 - ML@Quora
 
Lean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven CompaniesLean DevOps - Lessons Learned from Innovation-driven Companies
Lean DevOps - Lessons Learned from Innovation-driven Companies
 

Último

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...Wes McKinney
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observabilityitnewsafrica
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkPixlogix Infotech
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesManik S Magar
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integrationmarketing932765
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 

Último (20)

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
The Future Roadmap for the Composable Data Stack - Wes McKinney - Data Counci...
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security ObservabilityGlenn Lazarus- Why Your Observability Strategy Needs Security Observability
Glenn Lazarus- Why Your Observability Strategy Needs Security Observability
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
React Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App FrameworkReact Native vs Ionic - The Best Mobile App Framework
React Native vs Ionic - The Best Mobile App Framework
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotesMuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
MuleSoft Online Meetup Group - B2B Crash Course: Release SparkNotes
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS:  6 Ways to Automate Your Data IntegrationBridging Between CAD & GIS:  6 Ways to Automate Your Data Integration
Bridging Between CAD & GIS: 6 Ways to Automate Your Data Integration
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 

Lessons learned from building practical deep learning systems

  • 1. LESSONS LEARNED from building practical Deep Learning systems Xavier Amatriain (@xamat)
  • 2. A bit about myself...
  • 3. A bit about myself... ● PhD on Audio and Music Signal Processing and Modeling ● Researcher in Recommender Systems for several years ● Led ML Research/Engineering at Netflix ● VP of Engineering at Quora ● Currently co-founder/CTO at Curai (Providing the world’s best healthcare to everyone)
  • 4. A bit about Curai...
  • 5. What are we doing? ● Mission: Provide the world's best healthcare for everyone ● Product: User-facing mobile primary care app ● Team: Building an awesome and diverse team ● Approach: State-of-the-art AI/ML + product/UX/clinical [Figure panels: AI-based interaction; AI + Health coaches; AI + Doctors]
  • 9. More data or better models? Really?
  • 10. More data or better models? Sometimes, it’s not about more data
  • 11. More data or better models? Norvig: “Google does not have better Algorithms only more Data” Many features/ low-bias models
  • 12. More data or better models? Sometimes you might not need all your “Big Data” [Figure: testing accuracy vs. number of training examples (in millions, 0 to 20)]
  • 13. What about Deep Learning?
      Year | Breakthrough in AI | Dataset (first available) | Algorithm (first proposed)
      1994 | Human-level spontaneous speech recognition | Spoken Wall Street Journal articles and other texts (1991) | Hidden Markov Model (1984)
      1997 | IBM Deep Blue defeated Garry Kasparov | 700,000 Grandmaster chess games, aka “The Extended Book” (1991) | Negascout planning algorithm (1983)
      2005 | Google’s Arabic- and Chinese-to-English translation | 1.8 trillion tokens from Google Web and News pages (collected in 2005) | Statistical machine translation algorithm (1988)
      2011 | IBM Watson became the world Jeopardy! champion | 8.6 million documents from Wikipedia, Wiktionary, Wikiquote, and Project Gutenberg (updated in 2005) | Mixture-of-Experts algorithm (1991)
      2014 | Google’s GoogLeNet object classification at near-human performance | ImageNet corpus of 1.5 million labeled images and 1,000 object categories (2010) | Convolutional neural network algorithm (1989)
      2015 | Google’s DeepMind achieved human parity in playing 29 Atari games by learning general control from video | Arcade Learning Environment dataset of over 50 Atari games (2013) | Q-learning algorithm (1992)
      Average number of years to breakthrough | | 3 years | 18 years
      The average elapsed time between key algorithm proposals and corresponding advances was about 18 years, whereas the average elapsed time between key dataset availabilities and corresponding advances was less than 3 years, or about 6 times faster.
  • 14. What about Deep Learning? Models and Recipes: pretrained models available, trained using OpenNMT: English-to-German; German-to-English; English summarization; Multi-way (FR, ES, PT, IT, RO <> FR, ES, PT, IT, RO). More models coming soon: Ubuntu Dialog Dataset; Syntactic Parsing; Image-to-Text
  • 15. More data and better data
  • 18. Occam’s razor: Given two models that perform more or less equally, you should always prefer the less complex one. Deep Learning might not be preferred, even if it squeezes out an extra 1% in accuracy.
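The Occam's razor rule on this slide can be sketched as a small selection function. This is an illustrative sketch only: the `Candidate` fields, the `tolerance` parameter, the complexity proxy, and all numbers are assumptions made for the example, not values from the talk.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    accuracy: float   # offline accuracy on a held-out set (made-up values below)
    complexity: int   # hypothetical proxy for complexity, e.g. parameter count


def pick_model(candidates, tolerance=0.01):
    """Return the least complex candidate within `tolerance` of the best accuracy."""
    best = max(c.accuracy for c in candidates)
    good_enough = [c for c in candidates if best - c.accuracy <= tolerance]
    return min(good_enough, key=lambda c: c.complexity)


candidates = [
    Candidate("logistic_regression", 0.912, 41),
    Candidate("gbdt", 0.918, 5_000),
    Candidate("deep_net", 0.921, 2_000_000),
]
chosen = pick_model(candidates)
print(chosen.name)
```

With a tolerance of one accuracy point, the simplest model is chosen even though the deep network scores slightly higher. Tightening `tolerance` shifts the choice toward the more complex candidates.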
  • 19. Reasons to prefer a simpler model
  • 20. Reasons to prefer a simpler model: System complexity; Maintenance; Explainability; … There are many others. [Figure 3: GoogLeNet network with all the bells and whistles]
  • 21. A real-life example. Goal: Supervised Classification → 40 features → 10k examples. What did the ML Engineer choose? → Multi-layer ANN trained with TensorFlow. What was his proposed next step? → Try ConvNets. Where is the problem? → Hours to train, already looking into distributing → There are much simpler approaches
  • 22. .... But, sometimes you do need a Complex Model Lesson 3
  • 23. Better models and features that “don’t work”. E.g. You have a linear model and have been selecting and optimizing features for that model → More complex model with the same features -> improvement not likely → More expressive features with the same linear model -> improvement not likely. More complex features may require a more complex model. A more complex model may not show improvements with a feature set that is too simple.
  • 25. Feature Engineering Example - Answer Ranking. What is a good Quora answer? Truthful; Reusable; Provides explanation; Well formatted; ... How are those dimensions translated into features? Features that relate to the answer quality itself; Interaction features (upvotes/downvotes, clicks, comments…); User features (e.g. expertise in topic)
  • 26. Feature Engineering. Properties of a well-behaved ML feature: Interpretable; Reliable; Reusable; Transformable. Deep Learning: Automating Feature Discovery. [Figure (I. Goodfellow): how input is mapped to output in rule-based systems (hand-designed program), classic machine learning (hand-designed features, mapping from features), representation learning (learned features, mapping from features), and deep learning (simplest features, most complex features, mapping from features)]
  • 27. Deep Learning & Feature Engineering
  • 28. Deep Learning & Feature Architecture Engineering
  • 30. Supervised/Unsupervised Learning. Unsupervised learning as dimensionality reduction. Unsupervised learning as feature engineering. The “magic” behind combining unsupervised/supervised learning: E.g.1 Clustering + kNN. E.g.2 Matrix Factorization. MF can be interpreted as Unsupervised: • Dimensionality Reduction a la PCA • Clustering (e.g. NMF); or as Supervised: • Labeled targets ~ regression
  • 31. Supervised/Unsupervised Learning. One of the “tricks” in Deep Learning is how it combines unsupervised/supervised learning → E.g. Stacked Autoencoders → E.g. training of convolutional nets. [Figure: Stacked Autoencoders (input, Features I, Features II, softmax classifier)] [Figure: Convolutional Network (ConvNet) with convolution, pooling, and subsampling layers] → Non-Linearity: half-wave rectification, shrinkage function, sigmoid → Pooling: average, L1, L2, max → Training: Supervised (1988-2006), Unsupervised+supervised (2006-now). [Figure: taxonomy of supervised and unsupervised methods, with neural networks spanning both]
  • 34. Ensembles. The Netflix Prize was won by an ensemble → Initially Bellkor was using GBDTs → BigChaos introduced ANN-based ensemble. Most practical applications of ML run an ensemble → Why wouldn’t you? → At least as good as the best of your methods → Can add completely different approaches (e.g. CF and content-based) → You can use many different models at the ensemble layer: LR, GBDTs, RFs, ANNs...
  • 35. Ensembles & Feature Engineering. Ensembles are the way to turn any model into a feature! E.g. Don’t know if the way to go is to use Factorization Machines, Tensor Factorization, or RNNs? → Treat each model as a “feature” → Feed them into an ensemble. [Figure: Wide models, Deep models, and Wide & Deep models (sparse features, dense embeddings, hidden layers, output units)]
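A minimal sketch of "treat each model as a feature": the scores of two base models are combined by a one-parameter linear blender whose weight is chosen by grid search. The blender is a simple stand-in for the LR/GBDT/ANN ensemble layers named on the slide, and the score lists, labels, and function names are invented for illustration.

```python
def mse(preds, labels):
    """Mean squared error between predictions and labels."""
    return sum((p - y) ** 2 for p, y in zip(preds, labels)) / len(labels)


def fit_blend_weight(scores_a, scores_b, labels, steps=10):
    """Grid-search w in [0, 1] for the blend w * a + (1 - w) * b with lowest MSE."""
    best_w, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        w = i / steps
        blended = [w * a + (1 - w) * b for a, b in zip(scores_a, scores_b)]
        err = mse(blended, labels)
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


# Hypothetical held-out labels and scores from two base models.
labels = [1, 0, 1, 1, 0, 0]
cf_scores = [0.9, 0.4, 0.6, 0.8, 0.3, 0.5]        # e.g. a collaborative-filtering model
content_scores = [0.6, 0.1, 0.9, 0.5, 0.4, 0.2]   # e.g. a content-based model

weight, blended_err = fit_blend_weight(cf_scores, content_scores, labels)
print(weight, blended_err)
```

Because the grid includes w = 0 and w = 1, the blend is never worse than either base model on the data used to fit the weight. Whether that carries over to unseen data has to be checked on a separate held-out set.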
  • 36. There are biases in your data Lesson 7
  • 37. Defining training/testing data Training a simple binary classifier for good/bad answer → Defining positive and negative labels -> Non-trivial task → Is this a positive or a negative? → funny uninformative answer with many upvotes → short uninformative answer by a well-known expert in the field → very long informative answer that nobody reads/upvotes → informative answer with grammar/spelling mistakes → ...
  • 38. The curse of presentation bias. User can only click on what you decide to show → But, what you decide to show is the result of what your model predicted is good. Simply treating things you show as negatives is not likely to work. Better options → Correcting for the probability a user will click on a position -> Attention models → Explore/exploit approaches such as MAB. [Figure: positions a user is more likely vs. less likely to see]
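One simple way to correct for position is inverse propensity weighting, where each click is up-weighted by the inverse of the probability that its position was examined at all. The sketch below uses that technique as a stand-in for the attention models mentioned on the slide. The examination probabilities and the click log are made up, and in practice the propensities would themselves have to be estimated.

```python
from collections import defaultdict

# Hypothetical probability that a user examines each display position.
EXAMINE_PROB = {1: 1.0, 2: 0.5, 3: 0.25}


def debiased_click_rates(log):
    """log: iterable of (item, position, clicked) tuples."""
    weighted_clicks = defaultdict(float)
    impressions = defaultdict(int)
    for item, position, clicked in log:
        impressions[item] += 1
        if clicked:
            # A click at a rarely examined position counts for more.
            weighted_clicks[item] += 1.0 / EXAMINE_PROB[position]
    return {item: weighted_clicks[item] / impressions[item] for item in impressions}


log = [
    ("a", 1, True), ("a", 1, False), ("a", 1, True), ("a", 1, False),
    ("b", 3, True), ("b", 3, False), ("b", 3, False), ("b", 3, False),
]
rates = debiased_click_rates(log)
print(rates)
```

In this toy log, item "a" has the higher raw click-through rate (2/4 vs. 1/4) only because it was always shown in the top slot. After weighting, item "b" comes out ahead.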
  • 40. Think about your models “in the wild” Lesson 8
  • 41. AI in the wild: Desired properties ● Easily extensible ○ Incrementally/iteratively learn from “human-in-the-loop” or from additional data ● Knows what it does not know ○ Models uncertainty in prediction ○ Enables fall-back to manual
  • 42. Assisted diagnosis in the wild 1. Extensibility a. Diagnosis as an ML task i. Expert systems as a prior b. Modeling less prevalent diseases i. Low-shot learning 2. Knowing what you don’t know a. Measures of uncertainty in prediction b. Allows fall-back to “physician-in-the-loop”
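A minimal sketch of "knowing what you don't know": compute the entropy of the predicted class distribution and fall back to a human when it is too high. The threshold, the class labels, and the `decide` function are hypothetical. Predicted probabilities from a model are often poorly calibrated, so this illustrates the fall-back pattern rather than a reliable uncertainty estimate.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def decide(probs, labels, max_entropy=0.5):
    """Return the top label, or defer to a human when the model is too uncertain."""
    if predictive_entropy(probs) > max_entropy:
        return "defer_to_human"
    top = max(range(len(probs)), key=probs.__getitem__)
    return labels[top]


labels = ["class_a", "class_b", "class_c"]
confident = decide([0.97, 0.02, 0.01], labels)
uncertain = decide([0.40, 0.35, 0.25], labels)
print(confident, uncertain)
```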
  • 43. Data and Models are great. You know what’s even better? The right evaluation approach! Lesson 9
  • 44. Offline/Online testing process. [Figure: flowchart with YES/NO decision branches. Offline Experimentation: Initial Hypothesis; Choose Model; Train Model; Test Offline; Hypothesis Validated?; Try different Model?; Reformulated Hypothesis. Online Experimentation: Design AB Test; Choose Control; Deploy Prototype; Observe Behavior; Analyze Results; Significant Improvements?; Deploy Feature]
  • 45. Executing A/B tests. Measure differences in metrics across statistically identical populations that each experience a different algorithm. Overall Evaluation Criteria (OEC) = e.g. member retention at Netflix → Use long-term metrics whenever possible → Short-term metrics can be informative and allow faster decisions, but are not always aligned with the OEC. Decisions on the product are always data-driven.
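Measuring a difference between two populations can be sketched with a two-proportion z-test on a binary metric such as "member retained". The counts below are invented, and the function name is an assumption. A real experiment would also need a pre-planned sample size and handling of multiple metrics.

```python
import math


def two_proportion_z_test(success_a, n_a, success_b, n_b):
    """Two-sided z-test for a difference between two proportions (pooled variance)."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p_value


# Hypothetical counts: control (A) vs. new algorithm (B), 10,000 members each.
z, p_value = two_proportion_z_test(1000, 10_000, 1100, 10_000)
print(round(z, 2), round(p_value, 4))
```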
  • 46. Offline testing Measure model performance, using (IR) metrics Offline performance = indication to make decisions on follow-up A/B tests A critical (and mostly unsolved) issue is how offline metrics correlate with A/B test results.
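As one example of an IR metric for offline testing, here is a sketch of NDCG@k using the linear-gain formulation. The graded relevance lists are made up, and the slide does not say which metrics were used in practice.

```python
import math


def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the top-k results (linear gain)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances, k):
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Graded relevance of results in the order the model ranked them.
perfect = ndcg_at_k([3, 2, 1, 0], k=4)
reversed_order = ndcg_at_k([0, 1, 2, 3], k=4)
print(perfect, reversed_order)
```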
  • 47. Do not underestimate the value of systems and frameworks Lesson 10
  • 48. ML vs Software Can you treat your ML infrastructure as you would your software one? → Yes and No You should apply best Software Engineering practices (e.g. encapsulation, abstraction, cohesion, low coupling…) However, Design Patterns for Machine Learning software are not well known/documented
  • 49. Software: the new frontier of ML?
  • 50. Your AI infrastructure will have two masters Lesson 11
  • 51. Machine Learning Infrastructure → Whenever you develop any ML infrastructure, you need to target two different modes: Mode 1: ML experimentation − Flexibility − Easy-to-use − Reusability Mode 2: ML production − All of the above + performance & scalability → Ideally you want the two modes to be as similar as possible → How to combine them?
  • 52. Machine Learning Infrastructure. Option 1 → Favor experimentation and only invest in productionizing once something shows results → E.g. Have ML researchers use R and then ask Engineers to implement things in production when they work. Option 2 → Favor production and have “researchers” struggle to figure out how to run experiments → E.g. Implement highly optimized C++ code and have ML researchers experiment only through data available in logs/DB
  • 54. Machine Learning Infrastructure. Good intermediate options → Have ML “researchers” experiment on Jupyter Notebooks using Python tools (scikit-learn, PyTorch, TF…). Use the same tools in production whenever possible; implement optimized versions only when needed. → Implement abstraction layers on top of optimized implementations so they can be accessed from regular/friendly experimentation tools
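The abstraction-layer idea can be sketched as a shared interface that both a readable reference implementation and an optimized backend satisfy, so the same experiment code runs against either. All class and function names here are hypothetical, and the "optimized backend" is simulated with a plain Python callable standing in for a native binding.

```python
from typing import Protocol, Sequence


class Scorer(Protocol):
    """Interface shared by experimentation and production implementations."""

    def score(self, features: Sequence[float]) -> float: ...


class ReferenceLinearScorer:
    """Readable pure-Python version, convenient for notebooks."""

    def __init__(self, weights, bias=0.0):
        self.weights, self.bias = list(weights), bias

    def score(self, features):
        return self.bias + sum(w * x for w, x in zip(self.weights, features))


class OptimizedScorerAdapter:
    """Wraps a (hypothetical) optimized backend behind the same interface."""

    def __init__(self, backend_fn):
        self._backend_fn = backend_fn

    def score(self, features):
        return float(self._backend_fn(tuple(features)))


def rank(items, scorer: Scorer):
    """Experiment code depends only on the Scorer interface."""
    return sorted(items, key=lambda item: scorer.score(item[1]), reverse=True)


items = [("x", [1.0, 0.0]), ("y", [0.0, 1.0])]
reference = ReferenceLinearScorer([0.5, 2.0])
fast = OptimizedScorerAdapter(lambda f: 0.5 * f[0] + 2.0 * f[1])  # stand-in backend
print([name for name, _ in rank(items, reference)])
print([name for name, _ in rank(items, fast)])
```

Keeping the reference implementation around also gives a cheap consistency check: the optimized path can be tested against it on the same inputs.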
  • 55. There is ML beyond Deep Learning Lesson 12
  • 56. Other ML Advances ● Factorization Machines ● Tensor Methods ● Non-parametric Bayesian models ● XGBoost ● Online Learning ● Reinforcement Learning ● Learning to rank ● ...
  • 57. Other very successful approaches
  • 58. Sometimes DL does not win
  • 60. 01. Choose the right metric 02. Be thoughtful about your data 03. Understand dependencies between data, models & systems 04. Optimize only what matters, beware of biases 05. Be thoughtful about your ML infrastructure/tools and about organizing your teams
  • 61. LESSONS LEARNED from building practical Deep Learning systems Xavier Amatriain (@xamat)