HOANG Cong Duy Vu's research logs: MTTK - Machine Translation Toolkit

Monday, 5 January 2015

MTTK - Machine Translation Toolkit

Link: http://mi.eng.cam.ac.uk/~wjb31/distrib/mttkv1/

Intro: MTTK is a collection of software tools for the alignment of parallel text for use in Statistical Machine Translation. With MTTK you can ...

Align document translation pairs at the sentence or sub-sentence level, sometimes known as chunking. This is a useful pre-processing step to prepare collections of translations for use in estimating the parameters of complex alignment models. Sub-sentence alignment in particular makes it possible to segment long sentences into shorter aligned segments that otherwise would have to be discarded.
Train statistical models for parallel text alignment. The following models are supported:
  IBM Model-1 and Model-2
  Word-to-Word HMMs
  Word-to-Phrase HMMs, with bigram translation probabilities
Parallelize your model training procedures. If you have multiple CPUs available, you can partition your translation training texts into subsets, which speeds up iterative parameter re-estimation and reduces the amount of memory needed in training. The partitioned procedure is still exact EM-based parameter estimation, not an approximation.
Generate word-to-word and word-to-phrase alignments of parallel text. MTTK can generate Viterbi alignments of parallel text (both training text and other texts) under the supported alignment models.
Extract word-to-word translation tables from aligned bitext and from the estimated models.
Extract phrase-to-phrase translation tables (phrase-pair inventories) from aligned parallel text.
Use the HMM alignment models to induce phrase translations under their statistical models. Phrase-pair induction can generate richer inventories of phrase translations than can be extracted from Viterbi alignments.
Edit the C++ source code to implement your own estimation and alignment procedures.
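
To make the list above more concrete, here is a minimal Python sketch of IBM Model-1 training with EM, where expected counts are collected per partition of the bitext and summed before the M-step, followed by a Viterbi alignment under the trained model. This is not MTTK code and does not use MTTK's interface (MTTK itself is written in C++); the function names `expected_counts`, `train_model1` and `viterbi_align`, the `NULL` token and the toy corpus are all assumptions made for illustration.

```python
from collections import defaultdict


def expected_counts(bitext, t):
    """E-step over one partition of the bitext (hypothetical helper).

    bitext: list of (source_words, target_words) pairs.
    t: mapping (target_word, source_word) -> t(f | e).
    Returns expected pair counts and per-source-word totals.
    """
    counts = defaultdict(float)
    totals = defaultdict(float)
    for src, tgt in bitext:
        src = ["NULL"] + src  # empty source word for unaligned target words
        for f in tgt:
            z = sum(t[(f, e)] for e in src)  # normaliser over source positions
            for e in src:
                p = t[(f, e)] / z  # posterior that f is generated by e
                counts[(f, e)] += p
                totals[e] += p
    return counts, totals


def train_model1(partitions, iterations=10):
    """Model-1 EM; each partition's E-step could run on a separate CPU."""
    t = defaultdict(lambda: 1.0)  # uniform (unnormalised) starting point
    for _ in range(iterations):
        counts = defaultdict(float)
        totals = defaultdict(float)
        for part in partitions:
            part_counts, part_totals = expected_counts(part, t)
            # Summing partition counts gives the same statistics as one pass
            # over the whole bitext, so the estimation stays exact.
            for pair, c in part_counts.items():
                counts[pair] += c
            for e, c in part_totals.items():
                totals[e] += c
        # M-step: renormalise so that t(. | e) sums to one for every e.
        t = defaultdict(float)
        for (f, e), c in counts.items():
            t[(f, e)] = c / totals[e]
    return t


def viterbi_align(src, tgt, t):
    """For each target word, return the index of its most likely source word
    (0 stands for NULL, 1 for the first real source word)."""
    src = ["NULL"] + src
    return [max(range(len(src)), key=lambda i: t[(f, src[i])]) for f in tgt]


# Toy bitext (made up for this sketch), split into two partitions.
toy = [
    (["the", "house"], ["das", "haus"]),
    (["the", "book"], ["das", "buch"]),
    (["a", "book"], ["ein", "buch"]),
]
table = train_model1([toy[:1], toy[1:]])
alignment = viterbi_align(["the", "house"], ["das", "haus"], table)
```

Because Model-1 alignment decisions are independent per target word, the Viterbi alignment here is just a per-word argmax; the HMM models in the list would need a dynamic-programming search instead.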
