Monday, 8 November 2010

Searchable Parallel Corpora

1. CABAL: an online concordancer for contrastive linguistics

http://cabal.rezo.net/ (University of Poitiers)

English, French

About 200 articles are currently online (roughly 400,000 words). Most come from Le Monde diplomatique and date from 1998 to December 2003.

2. The CLUVI corpus:

http://sli.uvigo.es/CLUVI/index_en.html

English, French, Spanish, Galician

Corpus: UNESCO Corpus of English-Galician-French-Spanish scientific-technical divulgation

3. German(-English) parallel corpora (Europarl and German News)

http://corpus.leeds.ac.uk/paraquery.html

English, German

4. WebTCE (Translation Corpus Explorer)

http://khnt.hit.uib.no/webtce.htm

English, German, French, Spanish, Norwegian, Danish

5. EVROKORPUS Parallel corpora

http://evrokorpus.gov.si/index.php?jezik=angl

223 million words. English, French, German, Italian, Slovene and Spanish. Searches must involve Slovene and one other language.

6. TERMACOR terminology and corpus

http://evrokorpus.gov.si/k2/index.php?jezik=angl

98 million words in 22 European languages. EU Commission data.

7. COMPARA Portuguese-English parallel corpus

http://www.linguateca.pt/COMPARA/

Three million words.

Portuguese, English

8. Termsearch

http://www.termsearch.info/ or a faster interface at:

http://www.bible-study-in-geneva.info/termsearch/

English, French, Russian

Major international treaties, conventions, agreements, etc. 792 documents.

9. English-Inuktitut Parallel Corpus

http://www.inuktitutcomputing.ca/NunavutHansard/en/

3.5 million words of English, 1.5 million words of Inuktitut.

English, Inuktitut (an Inuit language of north-eastern Canada)

10. English-Russian Parallel Corpus

http://ruscorpora.ru/search-para.html

English, Russian (some German?)

Interface only in Russian.

About 9 million words.

11. http://WeBiText.com

12. http://www.linguee.de/

13. http://corpus.consumer.es/corpus/

14. OPUS:

http://www.let.rug.nl/tiedeman/OPUS/

http://www.let.rug.nl/tiedeman/OPUS/bin/opuscqp.pl

15. LinearB

http://linearb.co.uk/

16. MyMemory

http://mymemory.translated.net/

17. Natura corpora

http://linguateca.di.uminho.pt/nat/nat.pl

18. Compara

http://www.linguateca.pt/COMPARA/Welcome.html

19. CLUVI

http://sli.uvigo.es/CLUVI/

20. English-Chinese

http://score.crpp.nie.edu.sg/cgi-bin/babel/paraconc.pl

21. TransSearch

http://www.tsrali.com/

22. English-Vietnamese

http://hellochao.com/

http://linkdict.com/Bilingual/

23. WebLitera - free international online book library

(to be updated)