Monday 27 August 2012

ACCURAT Toolkit

The ACCURAT project (http://www.accurat-project.eu/) is pleased to announce the release of the ACCURAT Toolkit, a collection of tools for collecting comparable corpora, aligning them at multiple levels, and extracting information from them. With the ACCURAT Toolkit, users can obtain:
- Comparable corpora from the Web (current news corpora, filtered Wikipedia corpora, and narrow-domain focussed corpora);
- Comparable document alignments;
- Semi-parallel sentence/phrase mapping from comparable corpora (for SMT training purposes or other tasks);
- Translated terminology extracted and mapped from bilingual comparable corpora;
- Translated named entities extracted and mapped from bilingual comparable corpora.
