Wednesday 5 January 2011

Scientext corpus

http://scientext.msh-alpes.fr

Scientext is a new online corpus of scientific texts in French and English. It comprises 4.8 million running tokens in French, 13 million words of English research articles (medicine and biology), and a 1.1-million-word sub-corpus of English texts written by French undergraduate students. The corpus is organized to facilitate the linguistic study of authorial position and reasoning in scientific articles, through phraseology and lexico-grammatical markers linked to causality.
