Researchers around the world (especially the teams at Google) are moving the approaches of Statistical Machine Translation (SMT) (e.g. word-based, phrase-based, hierarchical, syntax-based) to the next level, namely Neural Machine Translation.
In general, Neural Machine Translation aims to simplify the SMT pipeline by taking the source as an input sequence and producing the target as an output sequence via a single, large neural network.
Here I am trying to catch up with the recent progress of Neural Machine Translation.
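To make that picture concrete, below is a minimal, purely illustrative encoder-decoder sketch in plain numpy: a recurrent encoder reads the source token IDs into a single vector, and a recurrent decoder greedily emits target tokens from it. All sizes, weights and the greedy loop are my own assumptions (real systems use trained LSTM/GRU parameters, beam search, etc.), not any particular group's implementation.

import numpy as np

rng = np.random.default_rng(0)

SRC_VOCAB, TGT_VOCAB = 100, 100   # toy vocabulary sizes (assumption)
EMB, HID = 16, 32                 # embedding / hidden dimensions (assumption)
EOS = 0                           # end-of-sentence token id (assumption)

# Randomly initialised parameters stand in for a trained model.
E_src = rng.normal(scale=0.1, size=(SRC_VOCAB, EMB))   # source embeddings
E_tgt = rng.normal(scale=0.1, size=(TGT_VOCAB, EMB))   # target embeddings
W_enc = rng.normal(scale=0.1, size=(EMB + HID, HID))   # encoder RNN weights
W_dec = rng.normal(scale=0.1, size=(EMB + HID, HID))   # decoder RNN weights
W_out = rng.normal(scale=0.1, size=(HID, TGT_VOCAB))   # output projection

def encode(src_ids):
    """Read the whole source sequence into one fixed-size vector."""
    h = np.zeros(HID)
    for tok in src_ids:
        h = np.tanh(np.concatenate([E_src[tok], h]) @ W_enc)
    return h

def decode(h, max_len=20):
    """Greedily emit target tokens, starting from the encoder state."""
    out, prev = [], EOS
    for _ in range(max_len):
        h = np.tanh(np.concatenate([E_tgt[prev], h]) @ W_dec)
        prev = int(np.argmax(h @ W_out))   # most probable next token
        if prev == EOS:
            break
        out.append(prev)
    return out

print(decode(encode([5, 42, 7])))   # "translate" a toy source sentence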
*** People & Group
1) LISA Lab, University of Montreal (2014), led by Prof. Yoshua Bengio
Latest Demo: http://104.131.78.120/
*** Notable Papers
Note:
In addition, some other approaches use neural models to enhance the current state-of-the-art SMT framework, for example:
*** For Language Model:
Comments
- It is quite hard to choose optimal parameters (e.g. hidden layer nodes, input and output embedding dimensions; see the sketch at the end of this section) across datasets and domains.
- In Moses, the NPLM feature slows down decoding.
- It actually improves translation performance when used alongside n-gram LM features, but I am not sure whether it can completely replace them.
Note:
- Moses already has this feature.
3) rwthlm - a toolkit for training neural network language models (feed-forward, recurrent, and long short-term memory networks). The software was written by Martin Sundermeyer.
4) (to be updated)
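To illustrate the parameters mentioned in the comments above (hidden layer nodes, input and output embedding dimensions), here is a rough Python sketch of a feed-forward n-gram neural LM in the spirit of NPLM. All sizes and variable names are my own illustrative choices, not the actual NPLM or Moses implementation.

import numpy as np

rng = np.random.default_rng(1)

VOCAB = 1000    # vocabulary size (assumption)
ORDER = 5       # 5-gram model: 4 context words predict the 5th
IN_EMB = 150    # input embedding dimension (one knob to tune)
OUT_EMB = 150   # output embedding dimension (another knob)
HID = 750       # hidden layer nodes (another knob)

C = rng.normal(scale=0.1, size=(VOCAB, IN_EMB))               # input embeddings
W1 = rng.normal(scale=0.1, size=((ORDER - 1) * IN_EMB, HID))  # hidden layer
W2 = rng.normal(scale=0.1, size=(HID, OUT_EMB))               # to output space
D = rng.normal(scale=0.1, size=(VOCAB, OUT_EMB))              # output embeddings

def log_prob(context_ids, next_id):
    """log P(next_id | context_ids) under the toy feed-forward LM."""
    x = np.concatenate([C[i] for i in context_ids])  # concatenated context
    h = np.tanh(x @ W1)                              # hidden layer
    logits = D @ (h @ W2)                            # score every vocab word
    logits -= logits.max()                           # numerical stability
    return logits[next_id] - np.log(np.exp(logits).sum())

print(log_prob([3, 17, 42, 8], 99))   # score one 5-gram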
*** For Translation Model:
Note:
- ACL 2014 best paper award.
- According to the paper, they obtained very impressive performance for Arabic-English translation and good performance for Chinese-English translation (datasets: OpenMT 2012, BOLT; domains: news, web forums).
- Moses already has this feature: a basic implementation of this model is included under the name "BilingualLM".
- NPLM can be used to train the models for this (see the sketch at the end of this section).
Comments
- Personally, I tried this model with Moses and evaluated it on conversational domains (e.g. SMS, chat, conversational telephone speech) using OpenMT'15 datasets. I obtained good (but not very impressive, 0.7-1.0 BLEU improvement) performance compared to the basic baseline. Using this model together with other strong features did not give the significantly better performance reported in the paper :(.
- Optimizing parameters for this model is an exhausting task.
3) (to be updated)
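To show how such a joint model extends a plain neural n-gram LM, here is a small sketch of how the input for one prediction might be assembled: the target-side history is concatenated with a window of source words centred on the source word aligned to the current target position, and the result is fed into an NPLM-style feed-forward network. The window size, padding token and alignment handling are my own simplifying assumptions, not the exact BilingualLM/NNJM implementation.

PAD = "<pad>"

def nnjm_input(src_words, tgt_words, alignment, t, tgt_history=3, src_window=5):
    """Return the context words used to predict target word tgt_words[t].

    alignment[t] is the index of the source word aligned to target position t
    (unaligned words would need a heuristic, e.g. inheriting a neighbour's link).
    """
    # The previous tgt_history target words, padded at the sentence start.
    history = ([PAD] * tgt_history + tgt_words)[t:t + tgt_history]

    # A window of source words centred on the aligned source position.
    a = alignment[t]
    half = src_window // 2
    padded_src = [PAD] * half + src_words + [PAD] * half
    window = padded_src[a:a + src_window]

    return history + window   # fed into an NPLM-style feed-forward net

src = ["wo", "xihuan", "kafei"]
tgt = ["i", "like", "coffee"]
align = [0, 1, 2]
print(nnjm_input(src, tgt, align, t=2))
# -> ['<pad>', 'i', 'like', 'wo', 'xihuan', 'kafei', '<pad>', '<pad>']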
*** For Reordering Model:
1) Advancements in Reordering Models for Statistical Machine Translation (Minwei Feng et al., ACL 2013)
2) A Neural Reordering Model for Phrase-based Translation (Peng Li et al., COLING 2014)
3) (to be updated)