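Book Details
Joint Training for Neural Machine Translation (English Edition)
Author: Cheng Yong (程勇)
Publisher: Tsinghua University Press
Publication date: 2020-08-01
ISBN: 9787302561491
List price: ¥69.00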
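About the Book
A standard neural machine translation model is typically built as a single source-to-target translation model. In this thesis, we propose several methods for jointly training two neural machine translation models, covering the following topics: 1. improving attention models; 2. incorporating monolingual corpora; 3. improving pivot-based translation; 4. integrating bidirectional dependencies.
About the Author
The author graduated from Beijing Jiaotong University in 2012 and received a Ph.D. in engineering from Tsinghua University in 2017. He joined Tencent as a senior researcher in 2017. His main research area is machine translation, and he has published more than 10 papers at major international conferences such as ACL, IJCAI, and AAAI.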
Contents
1 Neural Machine Translation 1
1.1 Introduction 1
1.2 Neural Machine Translation 4
References 8
2 Agreement-Based Joint Training for Bidirectional Attention-Based Neural Machine Translation 11
2.1 Introduction 11
2.2 Agreement-Based Joint Training 12
2.3 Experiments 16
2.3.1 Setup 16
2.3.2 Comparison of Loss Functions 17
2.3.3 Results on Chinese-English Translation 18
2.3.4 Results on Chinese-English Alignment 18
2.3.5 Analysis of Alignment Matrices 19
2.3.6 Results on English-to-French Translation 21
2.4 Summary 22
References 22
3 Semi-supervised Learning for Neural Machine Translation 25
3.1 Introduction 25
3.2 Semi-supervised Learning for Neural Machine Translation 27
3.2.1 Supervised Learning 27
3.2.2 Autoencoders on Monolingual Corpora 27
3.2.3 Semi-supervised Learning 29
3.2.4 Training 30
3.3 Experiments 31
3.3.1 Setup 31
3.3.2 Effect of Sample Size k 32
3.3.3 Effect of OOV Ratio 34
3.3.4 Comparison with SMT 35
3.3.5 Comparison with Previous Work 36
3.4 Summary 38
References 39
4 Joint Training for Pivot-Based Neural Machine Translation 41
4.1 Introduction 41
4.2 Pivot-Based NMT 42
4.3 Joint Training for Pivot-Based NMT 45
4.3.1 Training Objective 45
4.3.2 Connection Terms 45
4.3.3 Training 46
4.4 Experiments 48
4.4.1 Setup 48
4.4.2 Results on the Europarl Corpus 49
4.4.3 Results on the WMT Corpus 50
4.4.4 Effect of Bridging Corpora 52
4.5 Summary 53
References 53
5 Joint Modeling for Bidirectional Neural Machine Translation with Contrastive Learning 55
5.1 Introduction 55
5.2 Unidirectional Neural Machine Translation 57
5.3 Bidirectional Neural Machine Translation 57
5.4 Decoding Strategies 61
5.5 Experiments 61
5.5.1 Setup 61
5.5.2 Effect of Translation Strategies 62
5.5.3 Comparison with SMT and Standard NMT 63
5.5.4 BLEU Scores Over Sentence Length 64
5.5.5 Comparison of Learning Curves 65
5.5.6 Analysis of Expected Embeddings 66
5.5.7 Results on English-German Translation 66
5.6 Summary 67
References 67
6 Related Work 69
6.1 Attentional Mechanisms in Neural Machine Translation 69
6.2 Capturing Bidirectional Dependencies 70
6.2.1 Capturing Bidirectional Dependencies 70
6.2.2 Agreement-Based Learning 70
6.3 Incorporating Additional Data Resources 71
6.3.1 Exploiting Monolingual Corpora for Machine Translation 71
6.3.2 Autoencoders in Unsupervised and Semi-supervised Learning 71
6.3.3 Machine Translation with Pivot Languages 72
6.4 Contrastive Learning 72
References 72
7 Conclusion 75
7.1 Conclusion 75
7.2 Future Directions 76
7.2.1 Joint Modeling 76
7.2.2 Joint Training 77
7.2.3 More Tasks 78
References 78