In this talk at the AI Frontiers conference, Jeff Dean discusses recent trends and developments in deep learning research. He touches on the significant progress this research has produced in a number of areas, including computer vision, language understanding, translation, healthcare, and robotics. These advances are driven both by new algorithmic approaches to some of these problems and by the ability to scale computation for training ever larger models on larger datasets. Finally, one reason for the rapid spread of deep learning ideas and techniques has been the availability of open-source libraries such as TensorFlow. He gives an overview of why these software libraries play an important role in making the benefits of machine learning available throughout the world.
8. Growing Use of Deep Learning at Google
Android
Apps
drug discovery
Gmail
Image understanding
Maps
Natural language understanding
Photos
Robotics research
Speech
Translation
YouTube
… many others ...
(Chart: growing number of directories containing model description files, across many products/areas)
9. Google Brain Team
● Research: 27 papers in ICML, NIPS, and ICLR in 2016, plus
others in venues like ACL, ICASSP, CVPR, ICER, and OSDI;
Brain Residency Program, hosted ~50 interns in 2016, ...
● Product impact: Many dozens of high impact collaborations in
Google products/efforts like Search, Ads, Photos, Translate, GMail,
Maps, Cloud ML, speech recognition, self-driving cars, robotics, …
● Open-source tools for ML: TensorFlow, Magenta, visualization
tools, …
g.co/brain
11. What do you want in a machine learning system?
● Ease of expression: for lots of crazy ML ideas/algorithms
● Scalability: can run experiments quickly
● Portability: can run on wide variety of platforms
● Reproducibility: easy to share and reproduce research
● Production readiness: go from research to real products
12. Open, standard software for general machine learning
Great for deep learning in particular
First released Nov. 2015
Apache 2.0 license
http://tensorflow.org/ and https://github.com/tensorflow/tensorflow
14. Why Did We Build TensorFlow?
Wanted system that was flexible, scalable, and production-ready
DistBelief, our first system, was good on two of these but lacked flexibility
Most existing open-source packages were likewise good on two of the three, but not all three
15. TensorFlow Goals
Establish common platform for expressing machine learning ideas and systems
Make this platform the best in the world for both research and production use
Open source it so that it becomes a platform for everyone, not just Google
16. Facts and Figures
Launched on Nov. 9, 2015
Initial launch was reasonably fully-featured:
auto differentiation, queues, control flow, fairly comprehensive set of ops, …
tutorials made the system accessible
out-of-the-box support for CPUs, GPUs, multiple devices, multiple platforms
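The auto differentiation mentioned above is reverse-mode: gradients flow backward through the computation graph via the chain rule. The following is a toy pure-Python sketch of that idea — the names (`Var`, `mul`, `add`, `backward`) are hypothetical and this is not the TensorFlow API.

```python
# Toy reverse-mode automatic differentiation, illustrating the idea behind
# auto differentiation in systems like TensorFlow. Pure-Python sketch only.

class Var:
    def __init__(self, value):
        self.value = value
        self.grad = 0.0
        self._backward = lambda: None  # leaves have nothing to propagate

def mul(a, b):
    out = Var(a.value * b.value)
    def _backward():
        # d(out)/da = b, d(out)/db = a (chain rule)
        a.grad += b.value * out.grad
        b.grad += a.value * out.grad
        a._backward(); b._backward()
    out._backward = _backward
    return out

def add(a, b):
    out = Var(a.value + b.value)
    def _backward():
        a.grad += out.grad
        b.grad += out.grad
        a._backward(); b._backward()
    out._backward = _backward
    return out

def backward(out):
    out.grad = 1.0
    out._backward()

# y = x*x + x  =>  dy/dx = 2x + 1; at x = 3, dy/dx = 7
x = Var(3.0)
y = add(mul(x, x), x)
backward(y)
print(y.value, x.grad)  # 12.0 7.0
```

Real systems record the graph once and traverse it in reverse topological order; this eager recursion is only adequate for a toy tree-shaped graph.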
17. Some Stats
500+ contributors, most of them outside Google
12,000+ commits since Nov. 2015 (~30 per day)
1M+ binary downloads
#15 most popular repository on GitHub by stars (just passed Linux!)
Used in ML classes at quite a few universities now:
Toronto, Berkeley, Stanford, …
Many companies/organizations using TensorFlow:
Google, DeepMind, OpenAI, Twitter, Snapchat, Airbus, Uber, ...
32. Experiment Turnaround Time and Research Productivity
● Minutes, Hours:
○ Interactive research! Instant gratification!
● 1-4 days
○ Tolerable
○ Interactivity replaced by running many experiments in parallel
● 1-4 weeks
○ High value experiments only
○ Progress stalls
● >1 month
○ Don’t even try
33. Just-In-Time Compilation
via TensorFlow’s XLA, "Accelerated Linear Algebra" compiler
0x00000000 movq (%rdx), %rax
0x00000003 vmovaps (%rax), %xmm0
0x00000007 vmulps %xmm0, %xmm0, %xmm0
0x0000000b vmovaps %xmm0, (%rdi)
...
TF graphs go in, optimized & specialized assembly comes out.
Let's explain that!
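The "graph in, specialized code out" idea can be illustrated with a toy compiler: take a small expression graph and emit a specialized function for it. This is only an analogy for what XLA does (XLA emits machine code, not Python); all names here (`Node`, `emit`, `compile_graph`) are made up for the sketch.

```python
# Toy "graph in, specialized code out" compiler, in the spirit of XLA's
# just-in-time compilation of TensorFlow graphs. Illustration only.

class Node:
    def __init__(self, op, left=None, right=None, name=None):
        self.op, self.left, self.right, self.name = op, left, right, name

def emit(node):
    """Recursively generate a Python expression string for the graph."""
    if node.op == "input":
        return node.name
    a, b = emit(node.left), emit(node.right)
    return f"({a} {'*' if node.op == 'mul' else '+'} {b})"

def compile_graph(node, args):
    # Generate source once, then evaluate it into a specialized function:
    # the graph-walking overhead is paid at compile time, not per call.
    src = f"lambda {', '.join(args)}: {emit(node)}"
    return eval(src)

# Graph for y = x * x + c, specialized into one Python lambda.
x = Node("input", name="x")
c = Node("input", name="c")
y = Node("add", Node("mul", x, x), c)
f = compile_graph(y, ["x", "c"])
print(f(3.0, 1.0))  # 10.0
```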
35. What are some ways that deep learning is having a significant impact at Google?
All of these examples were implemented using TensorFlow or our predecessor system
36. Speech Recognition
Acoustic input (“How cold is it outside?”) → deep recurrent neural network → text output
Reduced word errors by more than 30%
Google Research Blog - August 2012, August 2015
39. Reuse the same model for completely different problems
Same basic model structure, trained on different data, useful in completely different contexts
Example: given image → predict interesting pixels
42. We have tons of vision problems
Image search, StreetView, Satellite Imagery,
Translation, Robotics, Self-driving Cars,
www.google.com/sunroof
44. Performance on par or slightly better than the median of 8 U.S. board-certified ophthalmologists (F-score of 0.95 vs. 0.91).
http://research.googleblog.com/2016/11/deep-learning-for-detection-of-diabetic.html
45. Computers can now see
Large implications for machine learning in robotics
46. Combining Vision with Robotics
“Deep Learning for Robots: Learning
from Large-Scale Interaction”, Google
Research Blog, March, 2016
“Learning Hand-Eye Coordination for
Robotic Grasping with Deep Learning
and Large-Scale Data Collection”,
Sergey Levine, Peter Pastor, Alex
Krizhevsky, & Deirdre Quillen,
arXiv, arxiv.org/abs/1603.02199
48. RankBrain in Google Search Ranking
Deep neural network maps query & document features to a score for the (doc, query) pair
Query: “car parts for sale”; Doc: “Rebuilt transmissions …”
Launched in 2015
Third most important search ranking signal (of 100s)
Bloomberg, Oct 2015: “Google Turning Its Lucrative Web Search Over to AI Machines”
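A scorer of this shape — features in, relevance score out — can be sketched as a small feed-forward network. The layer sizes and random weights below are purely illustrative stand-ins; nothing here reflects RankBrain's actual architecture.

```python
import numpy as np

# Sketch of a (doc, query)-pair scorer: a small feed-forward network that
# maps concatenated query & document features to a relevance score in (0, 1).
# Sizes and weights are illustrative, not Google's model.

rng = np.random.default_rng(0)
W1 = rng.normal(size=(16, 8))   # 16 query+doc features -> 8 hidden units
W2 = rng.normal(size=(8, 1))    # hidden -> single relevance logit

def score(query_features, doc_features):
    x = np.concatenate([query_features, doc_features])  # shape (16,)
    h = np.tanh(x @ W1)                                 # hidden layer
    logit = (h @ W2).item()
    return 1 / (1 + np.exp(-logit))                     # sigmoid -> (0, 1)

q = rng.normal(size=8)   # stand-in features for "car parts for sale"
d = rng.normal(size=8)   # stand-in features for "Rebuilt transmissions …"
s = score(q, d)
print(s)
```

In production such a score would be one signal among hundreds, combined with the other ranking signals.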
49. Sequence-to-Sequence Model
[Sutskever & Vinyals & Le, NIPS 2014]
A deep LSTM encodes an input sequence (A B C) and decodes a target sequence (X Y Z Q)
50–54. Sequence-to-Sequence Model: Machine Translation
[Sutskever & Vinyals & Le, NIPS 2014]
Input sentence: Quelle est votre taille? <EOS>
Target sentence, generated one word at a time: How → tall → are → you?
At inference time: beam search to choose most probable over possible output sequences
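The beam search step at inference time can be sketched in a few lines: keep the `beam_width` most probable partial output sequences, extend each by every candidate next token, and re-prune. The fixed probability table below is a hypothetical stand-in for a trained decoder.

```python
import math

# Sketch of beam search decoding for a sequence-to-sequence model.
# The toy "model" is a fixed table of next-token probabilities.

PROBS = {
    (): {"How": 0.6, "What": 0.4},
    ("How",): {"tall": 0.7, "big": 0.3},
    ("What",): {"height": 1.0},
    ("How", "tall"): {"are": 1.0},
    ("How", "big"): {"are": 1.0},
    ("What", "height"): {"<EOS>": 1.0},
    ("How", "tall", "are"): {"you?": 1.0},
    ("How", "big", "are"): {"you?": 1.0},
    ("How", "tall", "are", "you?"): {"<EOS>": 1.0},
    ("How", "big", "are", "you?"): {"<EOS>": 1.0},
}

def beam_search(beam_width=2, max_len=5):
    beams = [((), 0.0)]  # (sequence, log-probability)
    for _ in range(max_len):
        candidates = []
        for seq, lp in beams:
            if seq and seq[-1] == "<EOS>":
                candidates.append((seq, lp))   # finished; carry forward
                continue
            for tok, p in PROBS[seq].items():
                candidates.append((seq + (tok,), lp + math.log(p)))
        # Keep only the beam_width most probable partial sequences.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
    return beams[0][0]

print(beam_search())  # ('How', 'tall', 'are', 'you?', '<EOS>')
```

Summing log-probabilities (rather than multiplying probabilities) keeps the search numerically stable for long sequences.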
55. Smart Reply
April 1, 2009: April Fool’s Day joke
Nov 5, 2015: Launched real product
Feb 1, 2016: >10% of mobile Inbox replies
60. Image Captions Research
Model: A close up of a child holding a stuffed animal.
Human: A young girl asleep on the sofa cuddling a stuffed bear.
Model: A baby is asleep next to a teddy bear.
66. More widespread use of:
Transfer and multi-task learning, zero-shot learning
67. Currently:
Most models are trained from scratch for a single task
This is quite inefficient:
Data inefficient: needs lots of data for each task
Computation inefficient: starting from scratch is a lot of work
Human ML expert inefficient: substantial effort required for each task
68. Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation,
Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat,
Fernanda Viégas, Martin Wattenberg, Greg Corrado, Macduff Hughes, and Jeffrey Dean
https://arxiv.org/abs/1611.04558
https://research.googleblog.com/2016/11/zero-shot-translation-with-googles.html
71. Outrageously Large Neural Networks: The Sparsely-gated Mixture-of-Experts Layer,
Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le & Jeff Dean
Submitted to ICLR 2017, https://openreview.net/pdf?id=B1ckMDqlg
Per-Example Routing
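Per-example routing in a sparsely-gated mixture-of-experts layer works by having a gating network score the experts and send each example to only its top-k experts, so most experts cost nothing for any given example. A minimal NumPy sketch, with illustrative sizes and random weights:

```python
import numpy as np

# Sketch of per-example routing in a sparsely-gated mixture-of-experts
# layer: a gating network scores the experts, and each example runs through
# only its top-k experts, combined by softmax-normalized gate weights.

rng = np.random.default_rng(0)
num_experts, k, d = 4, 2, 8
W_gate = rng.normal(size=(d, num_experts))
experts = [rng.normal(size=(d, d)) for _ in range(num_experts)]

def moe_layer(x):                      # x: (d,) single example
    logits = x @ W_gate                # one gate score per expert
    top = np.argsort(logits)[-k:]      # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over selected experts only
    # Only the chosen experts run; the rest are skipped for this example.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

out = moe_layer(rng.normal(size=d))
print(out.shape)  # (8,)
```

This is what makes "outrageously large" networks affordable: total parameter count grows with the number of experts, while per-example compute grows only with k.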
73. Current:
Solution = ML expertise + data + computation
Can we turn this into:
Solution = data + 100X computation
???
74. Idea: model-generating model trained via RL
(1) Generate ten models
(2) Train them for a few hours
(3) Use loss of the generated models as reinforcement
learning signal
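The three-step loop above can be sketched end to end. A real system trains an RNN controller with reinforcement learning over many GPU-hours; in this toy version a simple reweighting update stands in for the RL step, and a hand-written loss function stands in for actually training the generated models. Everything here (`LAYER_CHOICES`, `toy_loss`, the update rule) is hypothetical.

```python
import random

# Sketch of a model-generating-model loop: sample candidate architectures,
# evaluate them, and use the result as a signal to bias future sampling.

random.seed(0)
LAYER_CHOICES = [1, 2, 3, 4]
WIDTH_CHOICES = [32, 64, 128, 256]

def toy_loss(layers, width):
    # Pretend the best architecture is 3 layers of width 128.
    return abs(layers - 3) + abs(width - 128) / 64

# Start with a uniform sampling distribution over the choices.
layer_w = {c: 1.0 for c in LAYER_CHOICES}
width_w = {c: 1.0 for c in WIDTH_CHOICES}

def sample(weights):
    return random.choices(list(weights), weights=list(weights.values()))[0]

for step in range(20):
    # (1) Generate ten candidate models.
    models = [(sample(layer_w), sample(width_w)) for _ in range(10)]
    # (2) "Train" them (here: just evaluate the toy loss).
    scored = sorted(models, key=lambda m: toy_loss(*m))
    # (3) Use the losses as a learning signal: upweight the best half.
    for layers, width in scored[:5]:
        layer_w[layers] += 0.5
        width_w[width] += 0.5

best = max(layer_w, key=layer_w.get), max(width_w, key=width_w.get)
print(best)  # likely (3, 128) under this toy loss
```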
80. Tensor Processing Unit
Custom Google-designed chip for
neural net computations
In production use for >20 months: used on every
search query, for neural machine translation,
for AlphaGo match, ...
81. Example queries of the future
Which of these eye images shows symptoms of diabetic retinopathy?
Please fetch me a cup of tea from the kitchen
Describe this video in Spanish
Find me documents related to reinforcement learning for robotics and summarize them in German
82. Conclusions
Deep neural networks are making significant strides in
speech, vision, language, search, robotics, healthcare, …
If you’re not considering how to use deep neural nets to solve
some problems, you almost certainly should be