Thursday, 10 June 2010

Selected papers at ACL 2010

Machine Translation

Pseudo-word for Phrase-based Machine Translation
Xiangyu Duan, Min Zhang and Haizhou Li

Learning Lexicalized Reordering Models from Reordering Graphs
Jinsong Su, Yang Liu, Yajuan Lv, Haitao Mi and Qun Liu

Filtering Syntactic Constraints for Statistical Machine Translation
Hailong Cao and Eiichiro Sumita

Error Detection for Statistical Machine Translation Using Linguistic Features
Deyi Xiong, Min Zhang and Haizhou Li

Boosting-based System Combination for Machine Translation
Tong Xiao, Jingbo Zhu, Muhua Zhu and Huizhen Wang

Bilingual Sense Similarity for Statistical Machine Translation
Boxing Chen, George Foster and Roland Kuhn

Summarization & Generation

A Risk Minimization Framework for Extractive Speech Summarization
Shih-Hsiang Lin and Berlin Chen

Entity-based local coherence modelling using topological fields
Jackie Chi Kit Cheung and Gerald Penn

Automatic Collocation Suggestion in Academic Writing
Jian-Cheng Wu, Yu-Chia Chang, Teruko Mitamura and Jason S. Chang

Identifying Non-Explicit Citing Sentences for Citation-Based Summarization
Vahed Qazvinian and Dragomir R. Radev

Automatic Generation of Story Highlights
Kristian Woodsend and Mirella Lapata

Plot Induction and Evolutionary Search for Story Generation
Neil McIntyre and Mirella Lapata

Metadata-Aware Measures for Answer Summarization in Community Question Answering
Mattia Tomasoni and Minlie Huang

A Hybrid Hierarchical Model for Multi-Document Summarization
Asli Celikyilmaz and Dilek Hakkani-Tur

A new Approach to Improving Multilingual Summarization using a Genetic Algorithm
Marina Litvak, Mark Last and Menahem Friedman

Cross-Language Document Summarization Based on Machine Translation Quality Prediction
Xiaojun Wan, Huiying Li and Jianguo Xiao

Generating image descriptions using dependency relational patterns
Ahmet Aker and Robert Gaizauskas

Information Extraction

Open Information Extraction Using Wikipedia
Fei Wu and Daniel S. Weld


The Human Language Project: Building a Universal Corpus of the World’s Languages
Steven Abney and Steven Bird

(see full list of papers at

Wednesday, 9 June 2010

Topic summarization

Scenario: the system takes a research topic as input and must automatically generate a summary of the related work relevant to that topic.
--> I think this research problem is still open and very challenging. It requires advanced processing that combines many fields of AI, such as NLP, IR and IE.

Some initial work (including mine) is as follows:
1) Scientific Paper Summarization Using Citation Summary Networks by Qazvinian V. et al. (COLING 2008).
--> this work only targets single article summarization using a clustering approach based on citation summary networks.
2) Generating surveys of scientific paradigms by Saif Mohammad et al. (NAACL 2009).
--> this work explores the usefulness of citation summaries compared to summaries built from abstracts or from the full text of articles.
3) Towards Automated Related Work Summarization by Cong Duy Vu HOANG et al. (COLING 2010)
--> this work does not use citation summaries but instead tries to take advantage of the full text of articles when generating a related work summary.
It makes the strong assumption that each related work summary follows a topic hierarchy tree, which is provided as input to the summarization system. The work then proposes two different strategies (general and specific content summarization), based on a manual rhetorical analysis of how humans use a topic hierarchy tree to write related work summaries.
4) Identifying Non-Explicit Citing Sentences for Citation-Based Summarization by Vahed Qazvinian and Dragomir R. Radev (ACL 2010)
--> TBA
5) Context Identification of Sentences in Related Work Sections using a Conditional Random Field: Towards Intelligent Digital Libraries by Angrosh M. A. et al. (JCDL 2010)
6) Imitating Human Literature Review Writing: An Approach to Multi-document Summarization by Jaidka K. et al. (ICADL 2010)
7) Analysis of the Macro-Level Discourse Structure of Literature Reviews by Jaidka K. et al. (Online Information Review)
8) Ultimate Research Assistant:
9) iResearch Reporter:
10) TBA
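
To make the topic hierarchy tree from item 3 more concrete, here is a minimal sketch of how such an input could be represented. Everything in it is my own illustration and not the actual system: the `TopicNode` class, the `plan_strategies` function, the example topics, and above all the rule that internal nodes get the "general" strategy while leaves get the "specific" one are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class TopicNode:
    """One node of a hypothetical topic hierarchy tree (illustrative only)."""
    name: str
    children: List["TopicNode"] = field(default_factory=list)

    def add(self, name: str) -> "TopicNode":
        """Attach a new sub-topic and return it so calls can be nested."""
        child = TopicNode(name)
        self.children.append(child)
        return child

    def is_leaf(self) -> bool:
        return not self.children


def plan_strategies(node: TopicNode, depth: int = 0) -> List[Tuple[int, str, str]]:
    """Walk the tree depth-first and pair every topic with a strategy label.

    Assumption (not confirmed by the paper summary above): internal nodes
    are summarized with the "general" strategy, leaves with "specific".
    """
    strategy = "specific" if node.is_leaf() else "general"
    plan = [(depth, node.name, strategy)]
    for child in node.children:
        plan.extend(plan_strategies(child, depth + 1))
    return plan


# Made-up example topics, only to show the shape of the input.
root = TopicNode("text summarization")
extractive = root.add("extractive summarization")
extractive.add("sentence ranking")
root.add("abstractive summarization")

plan = plan_strategies(root)
for depth, name, strategy in plan:
    print("  " * depth + f"{name} [{strategy}]")
```

The depth-first order mirrors how a related work section usually moves from a broad topic down to its sub-topics before going on to the next broad topic.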

Future work (what comes to my mind right now) includes:
- Given a research topic --> automatically generate a topic hierarchy tree of that topic.
- A systematic comparison of summaries built from citations, abstracts, full text of articles. Which ones are more useful to users?
- An initial add-in component integrated into the online ACL Anthology system.
- Other ways to improve summarization performance (e.g. using rhetorical discourse analysis, ...)
- ...
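
As a rough starting point for the comparison of summary sources listed above, one could score each candidate summary by its unigram overlap with a human-written reference. This is only a sketch under my own assumptions: the `tokens` and `unigram_recall` helpers are hypothetical, the texts are made-up toy strings, and word overlap is at best a crude proxy for how useful a summary is to users, which would really need a user study.

```python
import re
from collections import Counter


def tokens(text: str) -> list:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_recall(candidate: str, reference: str) -> float:
    """Fraction of reference word occurrences that also appear in the candidate."""
    ref_counts = Counter(tokens(reference))
    cand_counts = Counter(tokens(candidate))
    if not ref_counts:
        return 0.0
    # Clip each word's credit at the number of times the candidate contains it.
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    return overlap / sum(ref_counts.values())


# Toy strings invented for illustration; they are not real system outputs.
reference = "citation summaries capture the main contribution of a paper"
candidates = {
    "citations": "citation summaries capture the contribution",
    "abstracts": "the paper proposes a method",
}

scores = {source: unigram_recall(text, reference) for source, text in candidates.items()}
best = max(scores, key=scores.get)
for source, score in scores.items():
    print(f"{source}: {score:.2f}")
```

The same loop would extend to a third entry for full-text summaries, so the three sources can be ranked side by side on the same reference.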