Thursday 2 June 2011

Speller Challenge

http://web-ngram.research.microsoft.com/spellerchallenge/

Might consider taking part in such a big event!

--
Cheers,
Vu

SciVerse

http://www.info.sciverse.com/sciverse-applications
http://www.applications.sciverse.com/action/gallery

Elsevier SIGIR 2011 Application Challenge: http://developer.sciverse.com/SIGIR2011

Important Dates

+ Start date: June 6, 2011
+ End date: July 23, 2011
+ Judging starts: July 24, 2011
+ Judging ends: July 26, 2011
+ Announcement of the Winners: July 26, 2011

Prizes

+ First prize: 1,500 USD (VISA gift card)
+ Second prize: 1,000 USD (VISA gift card)
+ Third prize: 500 USD (VISA gift card)

Wednesday 1 June 2011

Free Online File Converter

http://www.cometdocs.com/

Fantastic UI. Great functionality for converting PDF to various file types (e.g. Office files, HTML, ...).

Best of all, it's free!

--
Cheers,
Vu

Topic Directory (~590K available categories so far)

http://www.dmoz.org/

The Open Directory Project is the largest, most comprehensive human-edited directory of the Web. It is constructed and maintained by a vast, global community of volunteer editors.

It is available in multiple languages. Great!

--
Cheers,
Vu

Sunday 29 May 2011

Interesting Papers at EMNLP 2011

Accepted Papers: http://conferences.inf.ed.ac.uk/emnlp2011/papers.html

1) A Weakly-supervised Approach to Argumentative Zoning of Scientific Documents
Yufan Guo, Anna Korhonen and Thierry Poibeau

2) Linear Text Segmentation Using Affinity Propagation
Anna Kazantseva and Stan Szpakowicz

3) Identifying Relations for Open Information Extraction
Anthony Fader, Stephen Soderland and Oren Etzioni

4) Active Learning with Amazon Mechanical Turk
Florian Laws, Christian Scheible and Hinrich Schütze

5) Extreme Extraction — Machine Reading in a Week
Marjorie Freedman, Lance Ramshaw, Elizabeth Boschee, Ryan Gabbard, Nicolas Ward and Ralph Weischedel

6) Discovering Relations between Noun Categories
Thahir Mohamed, Estevam Hruschka and Tom Mitchell

7) Bootstrapped Named Entity Recognition for Product Attribute Extraction
Duangmanee Putthividhya and Junling Hu

8) Predicting a Scientific Community’s Response to an Article
Dani Yogatama, Michael Heilman, Brendan O'Connor, Chris Dyer, Bryan R. Routledge and Noah A. Smith

9) Language Models for Machine Translation: Original vs. Translated Texts
Gennadi Lembersky, Noam Ordan and Shuly Wintner

10) Twitter Catches The Flu: Detecting Influenza Epidemics using Twitter
Eiji Aramaki, Sachiko Maskawa and Mizuki Morita

11) Rumor has it: Identifying Misinformation in Microblogs
Vahed Qazvinian, Emily Rosengren, Dragomir R. Radev and Qiaozhu Mei

SALM: Suffix Array and its Applications in Empirical Language Processing

Link: http://projectile.sv.cmu.edu/research/public/tools/salm/salm.htm
Another customized version: https://github.com/jhclark/salm

SALM is a C++ package that provides functions to locate n-grams in a large corpus and estimate their statistics. The SALM toolkit provides example applications such as estimating type/token frequency, locating n-gram occurrences, and a suffix array language model that can use an arbitrarily long history over a very large training corpus.
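
To get a feel for the idea, here is a minimal Python sketch of locating and counting n-grams with a token-level suffix array. It is not SALM's API or implementation: the function names (build_suffix_array, locate, count) are hypothetical, and the naive construction is only suitable for a toy corpus, whereas SALM is built for very large ones.

```python
from typing import List, Sequence


def build_suffix_array(tokens: Sequence[str]) -> List[int]:
    """Return the start positions of all suffixes, sorted by token sequence."""
    # Naive construction that compares whole suffixes: toy corpora only.
    return sorted(range(len(tokens)), key=lambda i: list(tokens[i:]))


def _bound(tokens: Sequence[str], sa: List[int], ngram: tuple, upper: bool) -> int:
    """Binary search over the suffix array.

    Returns the first index whose n-token prefix is >= ngram,
    or the first index whose prefix is > ngram when upper is True.
    """
    n = len(ngram)
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        prefix = tuple(tokens[sa[mid]:sa[mid] + n])
        if prefix < ngram or (upper and prefix == ngram):
            lo = mid + 1
        else:
            hi = mid
    return lo


def locate(tokens: Sequence[str], sa: List[int], ngram: Sequence[str]) -> List[int]:
    """Return the sorted corpus positions where ngram occurs."""
    key = tuple(ngram)
    start = _bound(tokens, sa, key, upper=False)
    end = _bound(tokens, sa, key, upper=True)
    return sorted(sa[start:end])


def count(tokens: Sequence[str], sa: List[int], ngram: Sequence[str]) -> int:
    """Return the token frequency of ngram in the corpus."""
    key = tuple(ngram)
    return _bound(tokens, sa, key, upper=True) - _bound(tokens, sa, key, upper=False)


corpus = "the cat sat on the mat the cat ran".split()
suffix_array = build_suffix_array(corpus)
print(locate(corpus, suffix_array, ["the", "cat"]))
print(count(corpus, suffix_array, ["the"]))
```

Because all suffixes that start with the same n-gram sit next to each other in the sorted array, two binary searches are enough to find every occurrence, and the n-gram can be of any length. That is the property that makes an arbitrarily long history feasible for a suffix array language model.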