8. Google's Neural Machine
Translation System: Bridging
the Gap between Human
and Machine Translation
[Wu et al., 2016]
https://japan.googleblog.com/2016/11/google.html
Machine translation using neural networks has been producing impressive research results for several years, and in September Google researchers announced this approach. The newly adopted system treats a sentence as a single unit rather than translating it piece by piece. By grasping the context of the whole sentence it can find more accurate translation candidates, and by then reordering and adjusting the words it produces translations that are more grammatical and closer to human language. Neural-network-based machine translation builds a system that keeps learning end to end; as it is used, its translations become better and more natural.
14. Decoding in phrase-based SMT

[Figure: decoding example for the source sentence 新たな 翻訳 手法 を 提案 する. A phrase table supplies candidate translations for each source phrase (e.g. 新たな → "new", "novel", "a new"; 翻訳 手法 → "translation method", "translation algorithm"; 提案 する → "we propose", "suggest"), and the decoder selects and orders candidates to produce the output "we propose a novel translation method".]
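The candidate-combination step in the figure can be sketched as a toy monotone decoder. This is an illustrative sketch, not the slide's actual system: the phrase table, its probabilities, and the restriction to monotone (left-to-right, no reordering) decoding are assumptions made for the example; a real phrase-based decoder also scores reordering and a language model, which is how the slide's output places "we propose" first.

```python
import math

# Toy phrase table (illustrative probabilities, not from a real system):
# tuple of source tokens -> list of (target phrase, probability)
PHRASE_TABLE = {
    ("新たな",): [("new", 0.4), ("novel", 0.3), ("a new", 0.3)],
    ("翻訳", "手法"): [("translation method", 0.7), ("translation algorithm", 0.3)],
    ("新たな", "翻訳", "手法"): [("a novel translation method", 0.5)],
    ("を", "提案", "する"): [("we propose", 0.6), ("suggest", 0.4)],
}

def decode_monotone(source_tokens):
    """Cover the source left to right with phrases from the table;
    best[i] holds the highest-scoring translation of the first i tokens."""
    n = len(source_tokens)
    best = [None] * (n + 1)
    best[0] = (0.0, [])
    for i in range(n):
        if best[i] is None:
            continue
        for j in range(i + 1, n + 1):
            for target, prob in PHRASE_TABLE.get(tuple(source_tokens[i:j]), []):
                score = best[i][0] + math.log(prob)
                if best[j] is None or score > best[j][0]:
                    best[j] = (score, best[i][1] + [target])
    return " ".join(best[n][1]) if best[n] else None

print(decode_monotone("新たな 翻訳 手法 を 提案 する".split()))
```

Because this sketch cannot reorder, it outputs "a novel translation method we propose"; adding distortion and language-model scores is what lets a full decoder prefer the slide's word order.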
52. A boy is playing a piano
Generating Video Description using Sequence-to-sequence Model with Temporal Attention
[Laokulrat+, COLING2016]
(Slide by Dr. Laokulrat)
64. List of teams participating in the translation task

Task columns: ASPEC (JE, EJ, JC, CJ), JPC (JE, EJ, JC, CJ, JK, KJ), BPPT (EI, IE), IITBC (HE, EH), pivot (HJ, JH)

Team ID      Organization                                                   Tasks entered
NAIST        Nara Institute of Science and Technology                       1
Kyoto-U      Kyoto University                                               4
TMU          Tokyo Metropolitan University                                  1
bjtu_nlp     Beijing Jiaotong University                                    8
Sense        Saarland University                                            2
NICT-2       National Institute of Information and Communication Technology 8
WASUIPS      Waseda University                                              1
EHR          Ehara NLP Research Laboratory                                  6
ntt          NTT Communication Science Laboratories                         1
TOKYOMT      Weblio, Inc.                                                   1
IITB-EN-ID   Indian Institute of Technology Bombay                          2
JAPIO        Japan Patent Information Organization                          5
IITP-MT      Indian Institute of Technology Patna                           1
UT-KAY       University of Tokyo                                            1
UT-AKY       University of Tokyo                                            1

Participants per task, in the column order above:
ASPEC 5, 7, 3, 6 | JPC 2, 3, 2, 6, 0, 2 | BPPT 2, 2 | IITBC 0, 2 | pivot 1, 0

(The original slide also highlights teams outside Japan, company teams, and NMT systems.)
74. Scientific papers, Japanese→English (1)

IN:  Hartree−Fock−Bogoliubov法を用いて変形した原子核の対密度分布を調べた。
REF: Using Hartree-Fock-Bogoliubov method, pairing density distribution of the deformed nuclei was investigated.
SMT: The pair density distribution of deformed nuclei was investigated using Hartree-Fock-Bogoliubov method. (5, 5)
NMT: The density distribution of deformed nuclei is investigated by the Hartree - Fock - Bogoliubov method. (4, 4)

IN:  軸対称な四重極変形した原子核を対象にした。
REF: Axial symmetric quadrupole deformed nuclei were examined.
SMT: The axisymmetric quadrupole deformation nuclei were made to be an object. (5, 5)
NMT: An axially symmetric quadrupole nucleus is studied. (4, 3)
75. Scientific papers, Japanese→English (2)

IN:  食事については,全介助から半介助となった。
REF: On the meal, changing from all support to half support was realized.
SMT: The meal became half assistance from the assistance. (1, 4)
NMT: On the meal, it became the assistance from all assistance assistance. (1, 2)

IN:  排せつ,排尿はまだ時間を要する
REF: It takes much more times in excretion and urination.
SMT: The excretion, urination, still needs the time. (5, 5)
NMT: Excretion and voidance are necessary for a long time. (1, 2)

IN:  スキー滑走中に転倒受傷し右膝変形ととう痛にて受診。
REF: A man tumbled over in a ski slide and was injured and then had right knee deformation and pain and so received medical examination.
SMT: He had medical examination by pain and right knee deformation injured overturning during ski. (3, 4)
NMT: The ski was injured in ski ski sliding, and it had medical examination by right knee deformation and pain. (2, 2)
76. Scientific papers, English→Japanese (1)

IN:  Using Hartree-Fock-Bogoliubov method, pairing density distribution of the deformed nuclei was investigated.
REF: Hartree−Fock−Bogoliubov法を用いて変形した原子核の対密度分布を調べた。
SMT: Hartree ‐ Fock ‐ Bogoliubov法を用いて,変形した核の対密度分布を調べた。(5, 5)
NMT: 標記の方法を用いて,変形核の対密度分布を調べた。 (4, 3)

IN:  By this approach, monodispersed swollen styrene particles were obtained and the fine particles in the title were synthesized.
REF: この方法で,単分散なスチレン膨潤粒子を得,標記微粒子を合成した。
SMT: このアプローチにより,単分散膨潤スチレン粒子が得られ,標記微粒子を合成した。(5, 5)
NMT: この方法により,単分散のスチレン粒子を得,標記微粒子を合成した。 (4, 4)

IN:  No side effect was noted during treatment.
REF: 治療中,副作用は認めなかった。
SMT: 治療中に副作用はなかった。 (5, 5)
NMT: 副作用は認めなかった。 (4, 3)
77. Scientific papers, English→Japanese (2)

IN:  The liver and intestine were extracorporeally observed through the central abdominal wall in the gestational age.
REF: 在胎中,肝と腸が中心腹壁を介し体外的に観察された。
SMT: 肝臓や小腸は妊娠期間における中枢腹壁を通して体外に観察された。 (5, 4)
NMT: 妊娠時の腹部CTでは,肝,腸管はほぼ全周性に描出された。 (1, 1)

IN:  Pilot plant tests showed the system gives clean water reusable as feed water to deionizers.
REF: パイロットプラントによる結果は,純水装置の原水として再利用可能レベル迄処理できた。
SMT: パイロットプラント試験はdeionizersに水として清浄水再使用を与えるシステムを示した。 (3, 4)
NMT: パイロットプラント試験により,本システムの有効性を確認した。 (2, 1)

IN:  Polonium 210 is found, in an extremely low amount though, in the cigarette smoke.
REF: またポロニウム210はごく微量であるが,タバコの煙の中にも含まれている。
SMT: たばこ煙が極めて低い量で210ポロニウムが見られる。 (3, 5)
NMT: シガレット210は極低濃度ではあるが,たばこ煙では検出されていない。 (1, 2)
86. Replacing words with semantically close in-vocabulary words [Li et al., 2016]

Input sentence:           Niantic は iOS 及び Android 端末 向け に ポケモンGO を 開発した
Input after replacement:  Google は 音楽 及び Android 端末 向け に ゲーム を 開発した
Translation output:       Google developed game for music and Android devices
Post-edited translation:  Niantic developed PokémonGO for iOS and Android devices

Known problems: attention errors, and replaced words sometimes not being translated.
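The replace-translate-restore pipeline can be sketched as follows. This is a hedged illustration of the idea in [Li et al., 2016], not the paper's implementation: the toy embeddings, the vocabulary, and the function names are all assumptions made for the example, and the translation step itself is omitted.

```python
import math

# Toy word vectors, assumed for illustration only.
EMBED = {
    "PokémonGO": [0.9, 0.1], "game": [0.8, 0.2],
    "Niantic": [0.1, 0.9], "Google": [0.2, 0.8],
}
# Assumed in-vocabulary word list of the NMT system.
IN_VOCAB = {"game", "Google", "for", "and", "devices", "developed",
            "Android", "music", "iOS"}

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def replace_oov(tokens):
    """Swap each out-of-vocabulary token for its most similar in-vocabulary
    word, recording the mapping so post-editing can undo the swap."""
    mapping, replaced = {}, []
    for tok in tokens:
        if tok in IN_VOCAB or tok not in EMBED:
            replaced.append(tok)
        else:
            best = max((w for w in IN_VOCAB if w in EMBED),
                       key=lambda w: cosine(EMBED[tok], EMBED[w]))
            mapping[best] = tok
            replaced.append(best)
    return replaced, mapping

def post_edit(output_tokens, mapping):
    """Restore the original words in the translated output."""
    return [mapping.get(tok, tok) for tok in output_tokens]
```

The problems noted on the slide arise exactly at the post-editing step: if attention mislinks the replacement word, or the system drops it from the output, the original word cannot be restored.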
87. Using characters instead of words

• Characters on the input side only [Costa-jussà and Fonollosa, 2016]
• Characters on the output side only [Chung et al., 2016]
• Characters on both input and output [Lee et al., 2016]
  – All three treat the whitespace between words as a character too,
    so they still use word information indirectly
• Hybrid of words and characters [Luong and Manning, 2016]
  – Built on word-level NMT
  – A character-level encoder builds representations for <UNK> tokens on the input side
  – <UNK> tokens on the output side are translated at the character level
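The input side of the hybrid approach can be sketched as a preprocessing step: in-vocabulary words feed the word-level encoder, while each <UNK> keeps its character sequence for a character-level encoder to consume. This is a minimal sketch of the data flow only; the vocabulary and function names are illustrative assumptions, not [Luong and Manning, 2016]'s code.

```python
# Assumed word vocabulary of the word-level NMT model.
VOCAB = {"the", "cat", "sat", "on", "mat"}

def prepare_input(tokens):
    """Map out-of-vocabulary words to <UNK>, remembering their characters
    so a character-level encoder can build a representation for each one."""
    word_stream, char_fallbacks = [], []
    for pos, tok in enumerate(tokens):
        if tok in VOCAB:
            word_stream.append(tok)
        else:
            word_stream.append("<UNK>")
            char_fallbacks.append((pos, list(tok)))  # chars for the char encoder
    return word_stream, char_fallbacks
```

On the output side the model works symmetrically: whenever the word-level decoder emits <UNK>, a character-level decoder generates the word letter by letter.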
89. The Byte Pair Encoding algorithm [Sennrich et al., 2016b]

Corpus (word : frequency):
  l o w : 5
  l o w e r : 2
  n e w e s t : 6
  w i d e s t : 3

Vocabulary (target size = 15)
  Initial vocabulary = the characters (11 symbols): l, o, w, e, r, n, w, s, t, i, d
  Merges, most frequent pair first: es (frequency 9), est (frequency 9), lo (frequency 7), low (frequency 7)

Segmentation after each merge:
  after es:  l o w | l o w e r | n e w es t | w i d es t
  after est: l o w | l o w e r | n e w est | w i d est
  after lo:  lo w | lo w e r | n e w est | w i d est
  after low: low | low e r | n e w est | w i d est
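The merge loop can be written in a few lines of Python. This is a minimal sketch in the spirit of the pseudocode in [Sennrich et al., 2016b]; the function names are ours and the paper's end-of-word marker is omitted for brevity. On the toy corpus it reproduces the merges es, est, lo, low.

```python
import collections
import re

def get_pair_stats(vocab):
    """Count adjacent symbol pairs across the corpus, weighted by word frequency."""
    pairs = collections.Counter()
    for word, freq in vocab.items():
        symbols = word.split()
        for i in range(len(symbols) - 1):
            pairs[symbols[i], symbols[i + 1]] += freq
    return pairs

def apply_merge(pair, vocab):
    """Rewrite every word, replacing the symbol pair with its concatenation."""
    pattern = re.compile(r"(?<!\S)" + re.escape(" ".join(pair)) + r"(?!\S)")
    return {pattern.sub("".join(pair), word): freq for word, freq in vocab.items()}

# The toy corpus: each word is a space-separated symbol (character) sequence.
vocab = {"l o w": 5, "l o w e r": 2, "n e w e s t": 6, "w i d e s t": 3}
merges = []
for _ in range(4):  # four merges grow the character vocabulary toward size 15
    stats = get_pair_stats(vocab)
    best = max(stats, key=stats.get)
    vocab = apply_merge(best, vocab)
    merges.append("".join(best))

print(merges)  # the learned merge operations, in order
```

At test time the same merge list is replayed on new words in the learned order, which is why rare words end up segmented into known subword units.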
135. References
Philip Arthur, Graham Neubig, and Satoshi Nakamura. 2016. Incorporating discrete translation lexicons into neural machine
translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1557–1567,
Austin, Texas, November. Association for Computational Linguistics.
Michael Auli and Jianfeng Gao. 2014. Decoder integration and expected bleu training for recurrent neural network language
models. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 2: Short
Papers), pages 136–142. Association for Computational Linguistics, June.
Michael Auli, Michel Galley, Chris Quirk, and Geoffrey Zweig. 2013. Joint language and translation modeling with recurrent
neural networks. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, pages
1044–1054. Association for Computational Linguistics.
Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. Neural machine translation by jointly learning to align and
translate. CoRR, abs/1409.0473.
Ozan Caglayan, Loïc Barrault, and Fethi Bougares. 2016. Multimodal attention for neural machine translation. CoRR,
abs/1609.03976.
Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua
Bengio. 2014. Learning phrase representations using rnn encoder–decoder for statistical machine translation. In
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1724–1734,
Doha, Qatar, October. Association for Computational Linguistics.
Sumit Chopra, Michael Auli, and Alexander M. Rush. 2016. Abstractive sentence summarization with attentive recurrent
neural networks. In Proceedings of the 2016 Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, pages 93–98, San Diego, California, June. Association for
Computational Linguistics.
Chenhui Chu, Raj Dabre, and Sadao Kurohashi. 2017. An empirical comparison of simple domain adaptation methods for
neural machine translation. CoRR, abs/1701.03214.
Junyoung Chung, Kyunghyun Cho, and Yoshua Bengio. 2016. A character-level decoder without explicit segmentation for
neural machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
(Volume 1: Long Papers), pages 1693–1703. Association for Computational Linguistics.
Marta R. Costa-jussà and José A. R. Fonollosa. 2016. Character-based neural machine translation. In Proceedings of the
54th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 357–361, Berlin,
Germany, August. Association for Computational Linguistics.
136.
Josep Maria Crego, Jungi Kim, Guillaume Klein, Anabel Rebollo, Kathy Yang, Jean Senellart, Egor Akhanov, Patrice Brunelle,
Aurelien Coquard, Yongchao Deng, Satoshi Enoue, Chiyo Geiss, Joshua Johanson, Ardas Khalsa, Raoum Khiari, Byeongil Ko,
Catherine Kobus, Jean Lorieux, Leidiana Martins, Dang-Chuan Nguyen, Alexandra Priori, Thomas Riccardi, Natalia Segal,
Christophe Servan, Cyril Tiquet, Bo Wang, Jin Yang, Dakun Zhang, Jing Zhou, and Peter Zoldan. 2016. Systran’s pure neural
machine translation systems. CoRR, abs/1610.05540.
Fabien Cromieres, Chenhui Chu, Toshiaki Nakazawa, and Sadao Kurohashi, 2016. Proceedings of the 3rd Workshop on
Asian Translation (WAT2016), chapter Kyoto University Participation to WAT 2016, pages 166–174. Workshop on Asian
Translation.
Jeffrey Dean, Greg Corrado, Rajat Monga, Kai Chen, Matthieu Devin, Mark Mao, Marc'aurelio Ranzato, Andrew
Senior, Paul Tucker, Ke Yang, Quoc V. Le, and Andrew Y. Ng. 2012. Large scale distributed deep networks. In F. Pereira, C. J. C.
Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 1223–1231.
Curran Associates, Inc.
Jacob Devlin, Rabih Zbib, Zhongqiang Huang, Thomas Lamar, Richard Schwartz, and John Makhoul. 2014. Fast and robust
neural network joint models for statistical machine translation. In Proceedings of the 52nd Annual Meeting of the
Association for Computational Linguistics (Volume 1: Long Papers), pages 1370–1380. Association for Computational
Linguistics.
Li Dong and Mirella Lapata. 2016. Language to logical form with neural attention. CoRR, abs/1601.01280.
Daxiang Dong, Hua Wu, Wei He, Dianhai Yu, and Haifeng Wang. 2015. Multi-task learning for multiple language translation.
In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint
Conference on Natural Language Processing (Volume 1: Long Papers), pages 1723–1732. Association for Computational
Linguistics.
Orhan Firat, Kyunghyun Cho, and Yoshua Bengio. 2016a. Multi-way, multilingual neural machine translation with a shared
attention mechanism. In Proceedings of the 2016 Conference of the North American Chapter of the Association for
Computational Linguistics: Human Language Technologies, pages 866–875. Association for Computational Linguistics.
Orhan Firat, Baskaran Sankaran, Yaser Al-Onaizan, Fatos T. Yarman Vural, and Kyunghyun Cho. 2016b. Zero-resource
translation with multi-lingual neural machine translation. In Proceedings of the 2016 Conference on Empirical Methods in
Natural Language Processing, pages 268–277. Association for Computational Linguistics.
Jianfeng Gao, Xiaodong He, Wen tau Yih, and Li Deng. 2014. Learning continuous phrase representations for translation
modeling. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long
Papers), pages 699–709. Association for Computational Linguistics, June.
137.
Jonas Gehring, Michael Auli, David Grangier, and Yann N. Dauphin. 2016. A convolutional encoder model for neural machine
translation. CoRR, abs/1611.02344.
Michael U. Gutmann and Aapo Hyvärinen. 2012. Noise-contrastive estimation of unnormalized statistical models, with
applications to natural image statistics. J. Mach. Learn. Res., 13(1):307–361, February.
K. Hashimoto and Y. Tsuruoka. 2017. Neural Machine Translation with Source-Side Latent Graph Parsing. ArXiv e-prints,
February.
Zhongjun He, 2015. Proceedings of the Fourth Workshop on Hybrid Approaches to Translation (HyTra), chapter Baidu
Translate: Research and Products, pages 61–62. Association for Computational Linguistics.
Sébastien Jean, Kyunghyun Cho, Roland Memisevic, and Yoshua Bengio. 2015. On using very large target vocabulary for
neural machine translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and
the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1–10. Association for
Computational Linguistics.
Shihao Ji, S. V. N. Vishwanathan, Nadathur Satish, Michael J. Anderson, and Pradeep Dubey. 2015. Blackout: Speeding up
recurrent neural network language models with very large vocabularies. CoRR, abs/1511.06909.
Melvin Johnson, Mike Schuster, Quoc V. Le, Maxim Krikun, Yonghui Wu, Zhifeng Chen, Nikhil Thorat, Fernanda B. Viégas,
Martin Wattenberg, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2016. Google’s multilingual neural machine
translation system: Enabling zero-shot translation. CoRR, abs/1611.04558.
Marcin Junczys-Dowmunt, Tomasz Dwojak, and Hieu Hoang. 2016. Is neural machine translation ready for deployment? A
case study on 30 translation directions. CoRR, abs/1610.01108.
Yoon Kim and Alexander M. Rush. 2016. Sequence-level knowledge distillation. In Proceedings of the 2016 Conference on
Empirical Methods in Natural Language Processing, pages 1317–1327. Association for Computational Linguistics.
Natsuda Laokulrat, Sang Phan, Noriki Nishida, Raphael Shu, Yo Ehara, Naoaki Okazaki, Yusuke Miyao, and Hideki Nakayama.
2016. Generating video description using sequence-to-sequence model with temporal attention. In Proceedings of COLING
2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 44–52, Osaka, Japan,
December. The COLING 2016 Organizing Committee.
Jason Lee, Kyunghyun Cho, and Thomas Hofmann. 2016. Fully character-level neural machine translation without explicit
segmentation. CoRR, abs/1610.03017.
Fengfu Li and Bin Liu. 2016. Ternary weight networks. CoRR, abs/1605.04711.
138.
Peng Li, Yang Liu, Maosong Sun, Tatsuya Izuha, and Dakun Zhang. 2014. A neural reordering model for phrase-based
translation. In Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical
Papers, pages 1897–1907. Dublin City University and Association for Computational Linguistics.
Xiaoqing Li, Jiajun Zhang, and Chengqing Zong. 2016. Towards zero unknown word in neural machine translation. In
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, IJCAI 2016, New York, NY, USA, 9-15
July 2016, pages 2852–2858.
Lemao Liu, Masao Utiyama, Andrew Finch, and Eiichiro Sumita. 2016. Neural machine translation with supervised
attention. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers,
pages 3093–3102. The COLING 2016 Organizing Committee.
Minh-Thang Luong and Christopher D. Manning. 2016. Achieving open vocabulary neural machine translation with hybrid
word-character models. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics
(Volume 1: Long Papers), pages 1054–1063. Association for Computational Linguistics.
Thang Luong, Hieu Pham, and Christopher D. Manning. 2015a. Effective approaches to attention-based neural machine
translation. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, pages 1412–1421.
Association for Computational Linguistics.
Thang Luong, Ilya Sutskever, Quoc Le, Oriol Vinyals, and Wojciech Zaremba. 2015b. Addressing the rare word problem in
neural machine translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and
the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 11–19. Association
for Computational Linguistics.
Valerio Antonio Miceli Barone and Giuseppe Attardi. 2015. Non-projective dependency-based pre-reordering with recurrent
neural network for machine translation. In Proceedings of the 53rd Annual Meeting of the Association for Computational
Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 846–
856. Association for Computational Linguistics.
Tomas Mikolov, Martin Karafiát, Lukáš Burget, Jan Černocký, and Sanjeev Khudanpur. 2010. Recurrent neural network
based language model. In INTERSPEECH 2010, 11th Annual Conference of the International Speech Communication
Association, Makuhari, Chiba, Japan, September 26-30, 2010, pages 1045–1048.
Frederic Morin and Yoshua Bengio. 2005. Hierarchical probabilistic neural network language model. In Robert G. Cowell and
Zoubin Ghahramani, editors, Proceedings of the Tenth International Workshop on Artificial Intelligence and Statistics,
pages 246–252. Society for Artificial Intelligence and Statistics.
139.
Toshiaki Nakazawa, Hideya Mino, Chenchen Ding, Isao Goto, Graham Neubig, and Sadao Kurohashi, 2016a. Proceedings of
the 3rd Workshop on Asian Translation (WAT2016), chapter Overview of the 3rd Workshop on Asian Translation, pages 1–46.
Workshop on Asian Translation.
Toshiaki Nakazawa, Manabu Yaguchi, Kiyotaka Uchimoto, Masao Utiyama, Eiichiro Sumita, Sadao Kurohashi, and Hitoshi
Isahara. 2016b. ASPEC: Asian scientific paper excerpt corpus. In Nicoletta Calzolari (Conference Chair), Khalid Choukri,
Thierry Declerck, Marko Grobelnik, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, and Stelios Piperidis,
editors, Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), pages
2204–2208, Portorož, Slovenia, May. European Language Resources Association (ELRA).
Abigail See, Minh-Thang Luong, and Christopher D. Manning. 2016. Compression of neural machine translation models via
pruning. In Proceedings of The 20th SIGNLL Conference on Computational Natural Language Learning, pages 291–301.
Association for Computational Linguistics.
Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016a. Controlling politeness in neural machine translation via side
constraints. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational
Linguistics: Human Language Technologies, pages 35–40, San Diego, California, June. Association for Computational
Linguistics.
Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016b. Improving neural machine translation models with monolingual
data. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers),
pages 86–96, Berlin, Germany, August. Association for Computational Linguistics.
Rico Sennrich, Barry Haddow, and Alexandra Birch. 2016c. Neural machine translation of rare words with subword units.
In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers),
pages 1715–1725, Berlin, Germany, August. Association for Computational Linguistics.
Shiqi Shen, Yong Cheng, Zhongjun He, Wei He, Hua Wu, Maosong Sun, and Yang Liu. 2016. Minimum risk training for neural
machine translation. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume
1: Long Papers), pages 1683–1692. Association for Computational Linguistics.
Xing Shi, Kevin Knight, and Deniz Yuret. 2016a. Why neural translations are the right length. In Proceedings of the 2016
Conference on Empirical Methods in Natural Language Processing, pages 2278–2282. Association for Computational
Linguistics.
140.
Xing Shi, Inkit Padhi, and Kevin Knight. 2016b. Does string-based neural mt learn source syntax? In Proceedings of the 2016
Conference on Empirical Methods in Natural Language Processing, pages 1526–1534. Association for Computational
Linguistics.
Katsuhito Sudoh and Masaaki Nagata, 2016. Proceedings of the 3rd Workshop on Asian Translation (WAT2016), chapter
Chinese-to-Japanese Patent Machine Translation based on Syntactic Pre-ordering for WAT 2016, pages 211–215. Workshop
on Asian Translation.
Ilya Sutskever, Oriol Vinyals, and Quoc V Le. 2014. Sequence to sequence learning with neural networks. In Z. Ghahramani,
M. Welling, C. Cortes, N. D. Lawrence, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 27,
pages 3104–3112. Curran Associates, Inc.
Akihiro Tamura, Taro Watanabe, and Eiichiro Sumita. 2014. Recurrent neural networks for word alignment model. In
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages
1470–1480. Association for Computational Linguistics.
Zhaopeng Tu, Yang Liu, Lifeng Shang, Xiaohua Liu, and Hang Li. 2016a. Neural machine translation with reconstruction. CoRR,
abs/1611.01874.
Zhaopeng Tu, Zhengdong Lu, Yang Liu, Xiaohua Liu, and Hang Li. 2016b. Modeling coverage for neural machine translation.
In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages
76–85. Association for Computational Linguistics.
Oriol Vinyals, Łukasz Kaiser, Terry Koo, Slav Petrov, Ilya Sutskever, and Geoffrey Hinton. 2015. Grammar as a foreign
language. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information
Processing Systems 28, pages 2773–2781. Curran Associates, Inc.
David Weiss, Chris Alberti, Michael Collins, and Slav Petrov. 2015. Structured training for neural network transition-based
parsing. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th
International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 323–333, Beijing, China, July.
Association for Computational Linguistics.
Jiaxiang Wu, Cong Leng, Yuhang Wang, Qinghao Hu, and Jian Cheng. 2016a. Quantized convolutional neural networks for
mobile devices. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
141.
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao,
Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Lukasz Kaiser, Stephan Gouws, Yoshikiyo
Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason
Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, and Jeffrey Dean. 2016b. Google’s neural machine
translation system: Bridging the gap between human and machine translation. CoRR, abs/1609.08144.
Kelvin Xu, Jimmy Ba, Ryan Kiros, Kyunghyun Cho, Aaron C. Courville, Ruslan Salakhutdinov, Richard S. Zemel, and Yoshua
Bengio. 2015. Show, attend and tell: Neural image caption generation with visual attention. CoRR, abs/1502.03044.
Barret Zoph and Kevin Knight. 2016. Multi-source neural translation. In Proceedings of the 2016 Conference of the North
American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 30–34.
Association for Computational Linguistics.
Barret Zoph, Deniz Yuret, Jonathan May, and Kevin Knight. 2016. Transfer learning for low-resource neural machine
translation. In Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, pages 1568–
1575. Association for Computational Linguistics.