Monday 18 April 2011

Tools for Vietnamese Spell Checking

1) Hunspell (Vietnamese version): http://code.google.com/p/hunspell-spellcheck-vi/

Original version: http://hunspell.sourceforge.net/

Java source code for Hunspell: http://dren.dk/hunspell.html

2) Aspell: http://aspell.net/

Available dictionaries: ftp://ftp.gnu.org/gnu/aspell/dict/0index.html

3) IBM csSpell (Context-sensitive Spelling Checker): http://www.alphaworks.ibm.com/tech/csspell

4) TBA
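
For readers who want a feel for what a dictionary-based checker does at its very simplest, here is a minimal sketch: look the word up, and if it is missing, suggest dictionary words that are one edit away. This is only an illustration and not how Hunspell, Aspell, or csSpell work internally. The tiny word list and the names `nfc`, `edits1`, and `suggest` are my own assumptions for the example. Strings are normalised to NFC so that a Vietnamese letter with diacritics counts as a single character.

```python
import unicodedata

def nfc(text):
    """Normalise to NFC so each accented Vietnamese letter is one code point."""
    return unicodedata.normalize("NFC", text)

def edits1(word, alphabet):
    """Return all strings one delete, swap, replace, or insert away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    swaps = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return deletes | swaps | replaces | inserts

def suggest(word, dictionary):
    """Return [word] if it is known, else known words one edit away (sorted)."""
    word = nfc(word)
    if word in dictionary:
        return [word]
    # Assumption: the alphabet is just the characters seen in the dictionary.
    alphabet = set("".join(dictionary))
    return sorted(edits1(word, alphabet) & dictionary)

# Hypothetical toy dictionary; a real one would come from a Hunspell/Aspell word list.
WORDS = {nfc(w) for w in ["tiếng", "việt", "chính", "tả"]}

print(suggest("tiêng", WORDS))  # wrong tone mark on the third letter
```

A real checker would also rank suggestions (for example by word frequency) and, for the context-sensitive case, look at neighbouring words.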

--
Cheers,
Vu

N-gram tools

1) http://homepages.inf.ed.ac.uk/lzhang10/ngram.html

2) Google Web N-gram
2a) Google Web N-gram Viewer: http://ngrams.googlelabs.com/
2b) Google Web N-gram Patterns: http://n-gram-patterns.sourceforge.net/

3) Microsoft Web N-gram: http://web-ngram.research.microsoft.com/info/

4) N-gram Statistics Package: http://ngram.sourceforge.net/

5) CMU Language Modeling Toolkit (version 2): http://www.speech.cs.cmu.edu/SLM/toolkit.html
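
As a quick reminder of what all of these tools compute at their core, here is a minimal sketch of n-gram counting in plain Python. It is not taken from any of the toolkits above; the function names `ngrams` and `count_ngrams` and the whitespace tokenisation are assumptions made for the illustration.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return an iterator over every run of n consecutive tokens, as tuples."""
    return zip(*(tokens[i:] for i in range(n)))

def count_ngrams(text, n=2):
    """Count n-grams in text, using lowercasing and whitespace splitting."""
    tokens = text.lower().split()
    return Counter(ngrams(tokens, n))

counts = count_ngrams("the cat sat on the mat and the cat slept", n=2)
print(counts.most_common(1))  # [(('the', 'cat'), 2)]
```

The real toolkits add what this sketch leaves out: proper tokenisation, smoothing, and storage that scales to web-sized corpora.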


Tools for corpus statistics

Thanks to the members of Corpora-List, I have compiled the following list of tools for corpus statistics:

1) TXM software: https://sourceforge.net/projects/textometrie

2) R: www.r-project.org

Accompanying books:

http://www.amazon.com/dp/3110205645
http://www.amazon.com/dp/0415962706

3) Lexico3: http://www.tal.univ-paris3.fr/lexico/lexico3.htm (seemingly a commercial tool)

4) TBA

If you know others, please let me know!

--
Cheers,
Vu

Relax about coding

http://www.scribd.com/doc/38648591/About-Coders-new-version