Friday, 10 July 2009

Soft skills for scientific research

Two very useful articles (available only in Vietnamese):

I think I have been going through similar circumstances in my research life, though it is only just beginning ^_^. As the two articles above show, we should be diligent, patient, and consistent to sustain our passion for scientific research. Hopefully, I will overcome most of the difficulties and realize my dream (becoming a well-skilled research scientist ^_^, still far from where I am now).


Thursday, 9 July 2009

Human computation

Manual data annotation is time-consuming, labor-intensive, and error-prone. Recently, human computation has emerged as a viable approach to data annotation; the idea is to harness what humans are good at but machines are poor at. Many tasks remain trivial for humans yet continue to challenge even the most sophisticated computer programs. Thus, intelligently combining computers and humans to solve complex tasks is becoming a promising approach. Two typical frameworks for human computation in data annotation are Games With A Purpose (GWAP) and Amazon Mechanical Turk (AMT).

During my earlier research at SoC@NUS, I had a chance to conduct a survey on human computation. I will post it here as soon as possible for your quick reference.

I think that human computation can become a primary research tool for quickly creating evaluation data in the future.


Tuesday, 7 July 2009

How to read a scientific article?

Do you think you already read scientific articles effectively and efficiently? If not, you can read some of the following articles (very nice presentation :D):

For a newbie in research like me, this is an extremely important skill to learn.



I have learned one lesson from my adviser.

Sometimes you are thinking about a problem and want to figure out a solution to it. The best way to do this is to first formulate some hypotheses about the problem theoretically, and then validate them empirically through experiments. Explain and analyze why the results look the way they do. It is worth noting that sometimes your data may not support your hypotheses.

Please do not run experiments only; that is not helpful.

Hopefully, this lesson will help me a lot. I will keep putting my best effort into my research.


Tools supporting our brainstorming

Topic modeling toolkit

A well-known tool that supports various algorithms for discovering latent topics in raw text.


Summarization toolkit

MEAD: developed by Dragomir Radev at Univ. of Michigan
- Uses a centroid-based summarization algorithm; see the related papers for more details
- You may run into some trouble when installing this tool. Please read the README file carefully before using it.
- Useful FAQs relating to MEAD:

ROUGE evaluation:


Machine Learning tools

Various machine learning algorithms such as SVM, Bayes, decision trees, ... are integrated in this tool. It also supports visualization of machine learning data, which is very useful for observation and analysis.


Reference Manager Tools (part 1)

There are currently many tools that support managing references in scientific research. I would like to introduce some that I find quite good:

JabRef: The JabRef output can be flexibly customized for various file formats (e.g. HTML, PDF, ...); see the following figure for a typical example (thanks to Zhao Shanheng for sharing this template file):

Zotero (Firefox add-on):

ForeCiteNote: This is one of the exciting projects undertaken by the WING research group at SoC@NUS. Try it!


Upcoming NLP conferences in 2010

- ACL 2010 [rank 1]
- NAACL 2010 [rank 1]
- EMNLP 2010 [rank 2]
- COLING 2010 [rank 2]
- CICLing 2010 [rank 3]
- AI related (NLP tracks):
+ AAAI'10 [rank 1]
+ ECAI'10 [rank 2]
+ PRICAI'10 [rank 3]
+ ICTAI'10 [rank 2]


Note that I will add their paper submission deadlines as soon as possible :D.


LaTeX and its related issues

Templates for LaTeX for beginners:
ACM templates:

LaTeX editors:
Texmaker: (cross-platform)
TeXstudio: (cross-platform)

(supports automatic generation of LaTeX code, very effective)

Tools supporting vector graphics:

LaTeX tutorials: a great LaTeX site

LaTeX resources:

LaTeX community:

LaTeX tips (ubiquitous):

1) Quotation Marks and Dashes

Single quotation marks are produced in LaTeX using ` and '. Double quotation marks are produced by typing `` and ''. (The `undirected' double quote character " produces double right quotation marks: it should never be used where left quotation marks are required.)

LaTeX allows you to produce dashes of various lengths, known as `hyphens', `en-dashes' and `em-dashes'. Hyphens are obtained in LaTeX by typing -, en-dashes by typing -- and em-dashes by typing ---.

One normally uses en-dashes when specifying a range of numbers. Thus for example, to specify a range of page numbers, one would type

on pages 155--219.

Dashes used for punctuating are often typeset as em-dashes, especially in older books. These are obtained by typing ---.
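
The quoting and dash rules above can be tried out in a small document. This is only a minimal sketch: the sentences are made-up examples, and the article class is an assumption rather than something the tip requires.

```latex
\documentclass{article}
\begin{document}

% Single quotes: a backtick opens, an apostrophe closes.
He called it a `quick fix'.

% Double quotes: two backticks open, two apostrophes close.
She replied, ``It is not that simple.''

% Hyphen (-) inside a compound word, en-dash (--) for a number range.
The well-known result is discussed on pages 155--219.

% Em-dash (---) used as punctuation.
The proof is short---only three lines.

\end{document}
```

Compiling this should show curly opening and closing quotes and three visibly different dash lengths.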

(to be continued)

2) LaTeX mathematical equation editor: -> an interactive tool, very cool!

3) Some strategies to include graphics in LaTeX documents

4) Using Visio to create EPS files (very helpful) (driver of EPS files for printer in Visio)

5) Rename "Contents" to "Table of Contents" using
\renewcommand\contentsname{Table of Contents}
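
To see where this command goes, here is a minimal sketch of a complete document. The article class and the placeholder section are assumptions made only for illustration.

```latex
\documentclass{article}

% Redefine the heading that \tableofcontents prints.
\renewcommand\contentsname{Table of Contents}

\begin{document}

\tableofcontents

% Placeholder section so that the table has at least one entry.
\section{Introduction}
Some text.

\end{document}
```

The document usually has to be compiled twice before the entries appear. If the babel package is loaded, a redefinition in the preamble tends to be overridden at \begin{document}; in that case, wrapping it as \addto\captionsenglish{\renewcommand\contentsname{Table of Contents}} (for English) is the usual workaround.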

6) Special characters in LaTeX:

7) Spell Checking for LaTeX documents

8) Footnote with caption

9) LaTeX tables with bar charts

10) LaTeX mathematics equations tips:

11) ... (to be added)