http://www.cs.iastate.edu/~honavar/grad-advice.html
--
Cheers,
Vu
Tuesday, 1 December 2009
flex & bison
http://oreilly.com/catalog/9780596155988
This is the first time I have heard of these tools. Are they worth learning?
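As I understand it, flex generates exactly this kind of tokenizer from a table of regular-expression rules. A toy Python sketch of the idea (the token classes are made up for illustration):

```python
import re

# Toy illustration of what a flex-generated lexer does: try each
# token pattern at the current position and emit (class, text) pairs.
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("OP",     r"[+\-*/=]"),
    ("SKIP",   r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))

def tokenize(text):
    tokens = []
    for m in MASTER.finditer(text):
        if m.lastgroup != "SKIP":  # drop whitespace
            tokens.append((m.lastgroup, m.group()))
    return tokens
```

bison then builds a parser over such a token stream, which is the part this sketch does not cover.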
--
Cheers,
Vu
Friday, 6 November 2009
The Google Technology Stack
http://michaelnielsen.org/blog/lecture-course-the-google-technology-stack/ by Michael Nielsen.
I should spend more time studying this!
--
Cheers,
Vu
MyMemory
http://mymemory.translated.net/ --> translations between various languages.
It's so great!
--
Cheers,
Vu
Labels:
computational linguistics,
links,
research,
translations
Thursday, 29 October 2009
New book "Probabilistic Graphical Models" (2009)
Probabilistic Graphical Models (more)
Principles and Techniques
Daphne Koller and Nir Friedman, 2009
--
Cheers,
Vu
Labels:
book,
graphical models,
links,
machine learning,
research
Wednesday, 28 October 2009
Project Gutenberg
Project Gutenberg (http://www.gutenberg.org/wiki/Main_Page) - a site containing more than 100,000 free online books in various languages ^_^. Notably, it provides access to the raw text of books for further processing (e.g. book summarization - a very interesting research direction that has been underestimated so far).
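Before doing any processing on the raw text, the Project Gutenberg boilerplate has to be stripped. A minimal Python sketch, assuming the common `*** START OF ...` / `*** END OF ...` marker lines used in their plain-text files:

```python
def strip_gutenberg_boilerplate(text):
    """Keep only the book body between the START and END marker lines.

    Assumes the '*** START OF ...' / '*** END OF ...' convention;
    some older files use different markers.
    """
    lines = text.splitlines()
    start, end = 0, len(lines)
    for i, line in enumerate(lines):
        if line.startswith("*** START OF"):
            start = i + 1          # body begins after this marker
        elif line.startswith("*** END OF"):
            end = i                # body ends before this marker
            break
    return "\n".join(lines[start:end]).strip()
```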
--
Cheers,
Vu
Tuesday, 27 October 2009
Reference Management Tools (part 2)
Just want to collect more information about reference management tools.
Part 1: here
1) BibDesk (http://bibdesk.sf.net) for Mac
2) Mendeley: http://www.mendeley.com/
If you know any others, please suggest them to me! Thanks in advance!
--
Cheers,
Vu
Sunday, 25 October 2009
Remarkable papers for Vietnamese Word Segmentation
By Doan NGUYEN (Hewlett-Packard Company)
1) "Query Preprocessing: Improving Web Search Through a Vietnamese Word Tokenization Approach". SIGIR'08 (short paper)
2) "Using Search Engine to Construct a Scalable Corpus for Vietnamese Lexical Development for Word Segmentation". Proceedings of the 7th Workshop on Asian Language Resources, ACL-IJCNLP 2009.
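A classic baseline for word segmentation (Vietnamese included) is greedy longest matching against a lexicon. A toy Python sketch of that baseline; the lexicon and example are made up, and the papers above use more sophisticated, corpus-driven methods:

```python
def max_match(syllables, lexicon, max_len=4):
    """Greedy longest-match segmentation of a syllable sequence into words.

    Unknown single syllables fall through as one-syllable words.
    """
    words = []
    i = 0
    while i < len(syllables):
        # try the longest candidate first, shrink until a match (or n == 1)
        for n in range(min(max_len, len(syllables) - i), 0, -1):
            cand = " ".join(syllables[i:i + n])
            if n == 1 or cand in lexicon:
                words.append(cand)
                i += n
                break
    return words
```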
--
Cheers,
Vu
Saturday, 24 October 2009
The Legacy of Randy Pausch
Prof. Randy Pausch is famous for two lectures: "Time Management" and "The Last Lecture" (MUST SEE THEM!).
http://www.cs.virginia.edu/~robins/Randy/ - a site collecting Prof. Pausch's lectures.
--
Cheers,
Vu
Thursday, 22 October 2009
Surveys & Books for Automatic Summarization
This post collects some useful surveys and books on automatic summarization from the literature so far.
Surveys
1) A survey on Automatic Text Summarization (link)
2) A Survey on Multi-Document Summarization (link)
3) Automatic summarising: The state of the art (link)
Books
1) Automatic Summarization by Mani (link)
2) Text summarisation by Hovy (link)
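Most classical extractive summarizers in this literature score sentences by the frequency of their content words (the Luhn-style idea). A minimal Python sketch of that idea, not any specific system from the surveys above:

```python
import re
from collections import Counter

def summarize(text, n_sentences=1):
    """Luhn-style extractive summary: score each sentence by the average
    corpus frequency of its words, keep the top ones in original order."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence):
        toks = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in toks) / (len(toks) or 1)

    ranked = sorted(sentences, key=score, reverse=True)[:n_sentences]
    return [s for s in sentences if s in ranked]  # restore document order
```

A real system would also remove stopwords and handle redundancy, which this sketch skips.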
--
Cheers,
Vu
Tuesday, 6 October 2009
Modeling and Reasoning with Bayesian Networks
New book in 2009 about Bayesian Networks: "Modeling and Reasoning with Bayesian Networks" by Prof. Adnan Darwiche.
Download link: http://gigapedia.com/items/346116/modeling-and-reasoning-with-bayesian-networks (make sure you are already logged in to that site)
Link on Amazon website: http://www.amazon.com/Modeling-Reasoning-Bayesian-Networks-Professor/dp/0521884381/ref=sr_1_1?ie=UTF8&s=books&qid=1239663545&sr=8-1
I am reading some chapters of this book to study Bayesian Networks, and I find it written in a very comprehensive and straightforward fashion; the accompanying examples make the heavy mathematical material much easier to digest. In my opinion, it may be better than the standard book on Bayesian Networks, "Learning Bayesian Networks" by Richard E. Neapolitan, which is very tough to read and understand.
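The core idea in these books is that a Bayesian network factorizes a joint distribution into local conditional tables, and simple queries can be answered by enumeration. A toy Python sketch with made-up numbers for the classic rain/sprinkler/wet-grass example:

```python
import itertools

# Tiny Bayesian network (all numbers are made up for illustration):
#   Rain -> WetGrass <- Sprinkler
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}
P_WET = {  # P(WetGrass=True | Rain, Sprinkler)
    (True, True): 0.99, (True, False): 0.9,
    (False, True): 0.8, (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    """The network factorizes the joint into local conditional tables."""
    p_wet = P_WET[(rain, sprinkler)]
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * (p_wet if wet else 1 - p_wet)

def p_rain_given_wet():
    """Inference by enumeration: sum the hidden variable out of the joint."""
    num = sum(joint(True, s, True) for s in (True, False))
    den = sum(joint(r, s, True)
              for r, s in itertools.product((True, False), repeat=2))
    return num / den
```

Enumeration is exponential in the number of variables; the book's algorithms (variable elimination, conditioning, compilation) are about doing this efficiently.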
--
Cheers,
Vu
Monday, 28 September 2009
SynView - A free syntax tree visualization tool
Introduction trailer on YouTube: http://www.youtube.com/watch?v=zFi9ldFYlEs
Extremely awesome, with 3D manipulation. It may be helpful for people who regularly deal with syntax trees in NLP.
--
Cheers,
Vu
Sunday, 27 September 2009
True Knowledge - The Internet Answer Engine
http://www.trueknowledge.com/
Just a beta version, but very impressive in what it can do.
--
Cheers,
Vu
Labels:
question answering,
research,
search,
search engines
Friday, 25 September 2009
Open source search engine
Below is some collected information on up-to-date open-source search engine toolkits:
1) Lucene: http://lucene.apache.org/ (Java)
--> CLucene: http://sourceforge.net/projects/clucene/ (C++)
2) Minion: https://minion.dev.java.net
3) Galago: http://www.galagosearch.org/
4) Xapian: http://xapian.org/ (C++)
5) http://www.searchenginecaffe.com/2007/03/open-source-search-engines-in-java-and.html (very informative)
6) Minion vs. Lucene: http://blogs.sun.com/searchguy/entry/minion_and_lucene_query_languages
7) Search Engine Wrapper (Yee Fan Tan - NUS): http://wing.comp.nus.edu.sg/~tanyeefa/downloads/searchenginewrapper/
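At the core of all these engines is an inverted index mapping terms to posting lists. A toy Python sketch of the data structure (AND queries only, nothing like a real engine's scale or scoring):

```python
from collections import defaultdict

class TinyIndex:
    """Minimal inverted index: term -> set of doc ids, AND queries only."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        """Intersect the posting sets of all query terms."""
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for t in terms[1:]:
            result &= self.postings.get(t, set())
        return result
```

Lucene, Xapian, and the others add tokenization/analysis pipelines, compressed on-disk postings, and relevance ranking on top of this basic structure.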
--
Cheers,
Vu
Wednesday, 16 September 2009
Bayesian Inference with Tears
I plan to read this article to understand more about Bayesian inference applied to NLP.
Link: http://www.isi.edu/natural-language/people/bayes-with-tears.pdf
(by Kevin Knight)
--
Cheers,
Vu
Labels:
Bayesian inference,
machine learning,
NLP,
research
Wednesday, 2 September 2009
Notes in machine learning
http://www.ics.uci.edu/~welling/classnotes/classnotes.html
I think such notes are very useful for learning more about topics in machine learning.
Useful datasets for Machine Learning: http://archive.ics.uci.edu/ml/
--
Cheers,
Vu
Tuesday, 1 September 2009
Markov Logic Networks
Markov Logic Networks, a combination of First-Order Logic and Markov Networks, are a new class of graphical models that will be very important for AI modeling in the future. Prof. Pedro Domingos at the Univ. of Washington is a pioneer in this field. Here are some major references by him:
1) New book "Markov Logic - An Interface Layer for AI".
Another edited book: "Integrating Logic and Statistics: Novel Algorithms in Markov Logic Networks" by Marenglen Biba
2) The course about Markov Logic Networks given by Prof. Pedro Domingos at Univ. of Washington.
3) The article "What's missing in AI - The Interface Layer".
4) Alchemy - Open source AI: http://alchemy.cs.washington.edu/
I wonder whether some NLP problems can benefit from such a new model.
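In a Markov Logic Network, each possible world's probability is proportional to exp of the weighted count of satisfied formula groundings. A toy Python sketch with a single made-up formula, Smokes(x) => Cancer(x), grounded for one constant:

```python
import itertools
import math

# Toy MLN: one constant, predicates Smokes and Cancer, and a single
# formula Smokes(x) => Cancer(x) with a made-up weight.
WEIGHT = 1.5

def unnorm(world):
    """world = (smokes, cancer); weight = exp(w * #satisfied groundings)."""
    smokes, cancer = world
    satisfied = 0 if (smokes and not cancer) else 1  # implication truth table
    return math.exp(WEIGHT * satisfied)

WORLDS = list(itertools.product([True, False], repeat=2))
Z = sum(unnorm(w) for w in WORLDS)  # partition function

def prob(world):
    return unnorm(world) / Z
```

The point of the formalism is that worlds violating a weighted formula become less probable, not impossible; Alchemy does this grounding and inference at scale.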
--
Cheers,
Vu
Labels:
machine learning,
Markov Logic Network,
NLP,
research
Graphical Models in a Nutshell
The paper by Prof. Daphne Koller:
http://robotics.stanford.edu/~koller/Papers/Koller+al:SRL07.pdf
MUST read this paper to understand the underlying principles behind graphical models before proceeding to investigate more!
--
Cheers,
Vu
Monday, 31 August 2009
Probability and Logic
Combining Probability and Logic - Journal of Applied Logic
This article is generally about how to use probability in combination with logic, also called language logic or metalanguage. Probabilistic approaches have proven to be more and more important in the field of Natural Language Processing and text processing.
Should read this article!
--
Cheers,
Vu
Sunday, 16 August 2009
NLG systems
1) SimpleNLG (2009)
http://code.google.com/p/simplenlg/
2) FUF/SURGE
FUF: Functional Unification Formalism Interpreter
SURGE: A Syntactic Realization Grammar for Text Generation
http://www.cs.bgu.ac.il/surge/index.html (1999)
http://homepages.inf.ed.ac.uk/ccallawa/resources.html (newest version, 2005)
3) More in http://www.aclweb.org/aclwiki/index.php?title=Downloadable_NLG_systems
--
Cheers,
Vu
Friday, 14 August 2009
Mojo Web Framework
http://mojolicious.org/ - A next-generation web framework for the Perl programming language.
Wednesday, 5 August 2009
Markov Logic
Markov Logic - a new graphical model for Natural Language Processing.
New book by Prof. Pedro Domingos: http://www.morganclaypool.com/doi/abs/10.2200/S00206ED1V01Y200907AIM007
I should study this model as soon as possible!
--
Cheers,
Vu
Sunday, 2 August 2009
ACL-IJCNLP'09 participation
Day 1 - 02/08/2009
Tutorial 1: Topics in Statistical Machine Translation by Kevin Knight (ISI) and Philippe Koehn (Edinburgh Univ.)
Some sub-topics within this tutorial that I need to take into account:
- Minimum Bayes-Risk decoding
- Re-evaluation of phrase-based SMT outputs
- MT system combination
- Efficient decoding (e.g. using cube pruning)
- Discriminative training with various features
I am looking for their slides (soft copy) for further reference. If you have them, please share them with me; thanks a lot!
Day 2 - 03/08/2009
Session 2B: Generation and Summarization 1
Talk 1: DEPEVAL (summ): Dependency-based Evaluation for Automatic Summaries by Karolina Owczarzak
- the main idea is to use dependency relations for summary evaluation
- better in comparison with ROUGE (2004) and BE (2005)
+ Question: difference between DEPEVAL and BE?
Note:
- Lexical-Functional Grammar (e.g. two syntactic structures to one functional structure)
- LFG parser
+ Charniak-Johnson syntactic parser (2005)
+ LFG annotation (2008)
Talk 2: Summarizing Definition from Wikipedia by Shiren Ye
- raises a new problem and its challenges: summarization of Wikipedia articles
+ recursive links
+ hidden information
- single-document summarization
- use existing approach named Document Concept Lattice (IPM 2007)
Talk 3: Automatically Generating Wikipedia Articles: A Structure-aware Approach by C. Sauper
- new problem in generating overview articles in Wikipedia using various resources crawled from the Internet
- template creation using clustering existing section topics in database
- proposed joint learning model that integrates Integer Linear Programming (ILP) into learning to optimize weights (for each section topic)
Note:
- evaluation of quality of generated articles is subjective (Prof. Hovy asked about this)!
Talk 4: Learning to tell tales: A Data-driven Approach to Story Generation by Neil McIntyre
- an interesting problem that results in an end-to-end generation system
- content selection (content) -> content planning (grammar) -> generation (use LM)
Note:
- how to evaluate the quality of generated stories in terms of coherence and interestingness?
Day 3 - 04/08/2009
Relax to save my energy to enjoy the interesting remaining sessions, especially in EMNLP!
Day 4 - 05/08/2009
Talk 1: SMS based Interface for FAQ Retrieval
- I actually could not follow the speaker of this talk.
Talk 2: A Syntax-free Approach to Japanese Sentence Compression
- It is worth noting some materials relevant to my current interests:
+ Intra-sentence positional term weighting
+ Patched language modeling
- Analysis of human-made reference compression
-> very helpful to figure out challenges in specific problems!
- combinatorial optimization problem
-> used to do parameter optimization (MCE-Minimum Classification Error in this paper)
- Statistical significance using the Wilcoxon signed-rank test
Talk 3: Application-driven Statistical Paraphrase Generation
- use SMT-like techniques but propose some new models within noisy channel model
+ paraphrase model (adapt)
+ LM (re-use)
+ usability (propose)
- the error analysis seems not compelling (only the very best outputs of the proposed system are exhibited); which components of the proposed framework are most influential?
Talk 4: Word or Phrase? Learning Which Unit to Stress for Information Retrieval
It does not interest me much; IR stuff!
Talk 5: A Generative Blog Post Retrieval Model that Uses Query Expansion based on External Collections
- Learn about query expansion techniques and how to integrate them into a specific problem (blog post retrieval in this paper)
+ worth noting: query expansion based on external resources
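A common form of query expansion is pseudo-relevance feedback: add frequent terms from top-ranked documents (here, from an external collection) back into the query. A minimal Python sketch of that idea, not the paper's actual generative model:

```python
from collections import Counter

def expand_query(query, top_docs, k=2):
    """Pseudo-relevance feedback: append the k most frequent terms from
    the top-ranked documents that are not already in the query."""
    q_terms = set(query.lower().split())
    freq = Counter(t for doc in top_docs for t in doc.lower().split()
                   if t not in q_terms)
    return query.lower().split() + [t for t, _ in freq.most_common(k)]
```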
Talk 6: An Optimal-Time Binarization Algorithm for Linear Context-Free Rewriting Systems with Fan-Out Two
- a lot of parsing-related material (especially on algorithm complexity) in this talk, which made me extremely confused!
Talk 7: A Polynomial-Time Parsing Algorithm for TT-MCTAG
- could not understand the material at all!
Talk 8: Unsupervised Relation Extraction by Mining Wikipedia Texts Using Information from the Web
- quite simple idea (just in my opinion) based on observation from Wikipedia
- use dependency as key components
Closing session:
- very interesting and sometimes funny!
- Prof. Frederick Jelinek received the Lifetime Achievement Award, followed by an interesting talk about his biography.
- announcements about future NLP conferences (COLING'10, ACL'10, ACL'11, NAACL'10, IJCNLP'10, LREC'10)
- announcements about best paper awards (see more details in ACL-IJCNLP'09 website)
An exhausting but helpful day. Preparing for the next days with ACL workshops and EMNLP sessions!
Day 5 - 06/08/2009
Talk 1 (invited talk): Query-focused Summarization Using Text-to-Text Generation: when Information Comes from Multilingual Sources by Kathleen R. McKeown
- This is the first time I have seen the face of Prof. Kathleen McKeown who was supervisor of my current supervisor (A/P Min-Yen KAN) hehe.
Some main points:
- typical approach for query-based summarization:
+ choose key sentences (word freq, position, clue words)
+ matches of query term against sentence terms
=> leads to:
+ irrelevant sentences
+ sentences placed out of context -> misconceptions
<= instead: generate new sentences from selected phrases (fluent sentences can become disfluent)
+ edit references to people (focus mainly on names)
- remove irrelevant sentences using sentence simplification
+ project DARPA GALE
+ interactive question user input
- NIGHTINGALE
+ use Wikipedia to expand query
+ consider name translation in multilingual resources
+ better if operating over phrases
- GLARF parser from NYU
- long sentences -> shorter sentences using sentence simplification
- redundancy detection => pairwise similarity across all sentences to identify concepts
+ alignment of dependency parses --> hypergraph
+ BOW
- future research direction: text generation for QA
Talk 2: A Classification Algorithm for Predicting the Structure of Summaries
- Interesting motivating question: how to "paste" selected sentences during abstracting?
- abstracting
+ some of materials not present
+ be modeled by cut-and-paste operations (Mani, 01)
- use specific verbs (predicates), for example: present, conclude, include, ...
- language tools
+ GATE (POS, morpho)
+ SUPPLE parser
Talk 3: Entity Extraction via Ensemble Semantics
- web -> entities -> top-PMI (point-wise mutual information) entities
Talk 4: Clustering to Find Exemplar Terms for Keyphrase Extraction
- relatedness
+ co-occurrence (statistics)
+ Wikipedia-based (e.g. PMI)
Day 6 - 07/08/2009
TBA
Just for taking notes!
--
Cheers,
Vu
Thursday, 30 July 2009
Java stuff
Of course, there are a lot of sites concerning this; below is one of them:
http://www.java2s.com/
(to be updated!)
--
Cheers,
Vu
Wednesday, 29 July 2009
Porter's Stemming Algorithm Online
http://maya.cs.depaul.edu/~classes/ds575/porter.html
Useful for quick reference to Porter Stemming Algorithm for English!
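For a feel of how the algorithm works, here is step 1a (plural handling) only, transcribed in Python; the full Porter stemmer has several more steps and measure-based conditions:

```python
def porter_step1a(word):
    """Step 1a of the Porter stemmer (plural suffixes only)."""
    if word.endswith("sses"):
        return word[:-2]   # caresses -> caress
    if word.endswith("ies"):
        return word[:-2]   # ponies -> poni
    if word.endswith("ss"):
        return word        # caress -> caress
    if word.endswith("s"):
        return word[:-1]   # cats -> cat
    return word
```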
--
Cheers,
Vu
Tuesday, 28 July 2009
Lucene stuffs
These days, I am trying to use Lucene for my own research. I note here some resources that may be relevant and useful:
Lucene in general: http://lucene.apache.org
-> you can find out more detail on this site!
Lucene Index Toolbox (Luke): http://www.getopt.org/luke/
-> This tool is very helpful for us to deal with functionality of Lucene search engine. It supports index, document browsing, search, ... with graphical UI (cross-platform).
Great articles:
1) Summarization with Lucene: http://sujitpal.blogspot.com/2009/02/summarization-with-lucene.html
-> In this article, the author implements his own summarizer using Lucene, based mainly on two simple summarization algorithms, namely Classifier4J (C4J) and Open Text Summarizer (OTS).
2) Lucene Analyzer, Tokenizer and TokenFilter: http://mext.at/?p=26
-> how to use analyzers, tokenizers, and filters in Lucene
3) Lucene Indexing and Document Scoring: (google the keyword "lucene indexing and document scoring")
-> contains some basic concepts and definitions in Lucene with comprehensive explanations.
4) Understanding Lucene Scoring: http://www.opensourcereleasefeed.com/article/show/understanding-lucene-scori
5) Lucene Query Syntax: http://lucene.apache.org/java/2_3_2/queryparsersyntax.html (replace the version "2_3_2" if you are using a newer one)
(to be continued ...)
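Lucene's relevance scoring is built around TF-IDF. A much-simplified Python sketch of the idea only; Lucene's real formula adds length norms, coord and boost factors, and more:

```python
import math
from collections import Counter

def tfidf_scores(query, docs):
    """Score each doc against the query with a plain TF-IDF sum.
    A simplification of the TF-IDF idea behind Lucene's scoring."""
    N = len(docs)
    tokenized = [d.lower().split() for d in docs]
    df = Counter()                       # document frequency per term
    for toks in tokenized:
        for t in set(toks):
            df[t] += 1
    scores = []
    for toks in tokenized:
        tf = Counter(toks)               # term frequency in this doc
        s = sum(tf[t] * math.log(1 + N / df[t])
                for t in query.lower().split() if df[t])
        scores.append(s)
    return scores
```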
--
Cheers,
Vu
Monday, 27 July 2009
AI software
Summarization: http://summarizer.intellexer.com/index.html
Extractor: http://www.extractor.com/
Surprising! AI software indeed!
--
Cheers,
Vu
Keywords Co-Occurrence and Semantic Connectivity
http://www.miislita.com/semantics/c-index-1.html
I would like to adapt some techniques mentioned in this article to the problem of keyword co-occurrence in the scientific domain (e.g. the ACL Anthology).
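One standard co-occurrence measure for such work is pointwise mutual information over document-level counts. A minimal Python sketch (each document is represented as a set of its keywords; the example data is made up):

```python
import math

def pmi(docs, a, b):
    """Pointwise mutual information between keywords a and b, using
    document-level co-occurrence counts: log P(a,b) / (P(a) P(b))."""
    N = len(docs)
    n_a = sum(1 for d in docs if a in d)
    n_b = sum(1 for d in docs if b in d)
    n_ab = sum(1 for d in docs if a in d and b in d)
    if not (n_a and n_b and n_ab):
        return float("-inf")     # never co-occur (or never occur)
    return math.log((n_ab / N) / ((n_a / N) * (n_b / N)))
```

PMI > 0 means the keywords co-occur more often than chance; smoothing would be needed on real, sparse keyword data.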
--
Cheers,
Vu
Sunday, 26 July 2009
Brown Coherence Toolkit
Link for download: http://www.cs.brown.edu/~melsner/egrid-distr.tgz
Manual: http://www.cs.brown.edu/~melsner/manual.html
Link to the author: http://www.cs.brown.edu/~melsner/
--
Cheers,
Vu
iOPENER
http://tangra.si.umich.edu/clair/iopener/index.html
The idea of automatically creating technical surveys using AI algorithms seems interesting but quite ambitious (in my understanding). This is inter-disciplinary research combining various techniques from Natural Language Processing, Natural Language Understanding, and Natural Language Generation. To some extent, it is really hard, and still far from reality :D.
See more in the newest paper at NAACL'09, "Using Citations to Generate Surveys of Scientific Paradigms"! Initially, the authors use existing citation contexts of articles combined with state-of-the-art techniques (e.g. Trimmer, LexRank, C-LexRank, C-RR) from extractive multi-document summarization (mostly developed on the news domain) to generate the surveys. They also draw some important conclusions:
- approaches from other domains, applied in the scientific setting, can produce satisfactory results
- citation contexts and abstracts contain much more useful information for summaries than full texts in papers
My comments on this are as follows:
- the specific features of scientific survey articles are not used yet, for example the structure of technical surveys, topic coherence, ...
- information fusion. Different citation contexts may contain overlapping information. How to pinpoint them?
--
Cheers,
Vu
Clair library
The Clair Library - A Perl package for Natural Language Processing, Information Retrieval and Network Analysis.
Just a note for further reference!
--
Cheers,
Vu
Scholarship links
Sites for seeking scholarships at different levels (undergraduate, master's, PhD):
1) http://scholarship-position.blogspot.com/
2) http://scholarshipsboard.com/
--
Cheers,
Vu
Thursday, 23 July 2009
NLP research links by Vlado Keselj
http://users.cs.dal.ca/~vlado/nlp/
This page, compiled by Prof. Vlado Keselj at Dalhousie University, contains quite a lot of research links in NLP.
Just a note for future search!
--
Cheers,
Vu
Wednesday, 22 July 2009
Language Experiments - The Portal for Psychological Experiments on Language
http://www.language-experiments.org/
This site may be very useful for people who want to create experiments for their own research. I am trying to figure out whether it can help with my own work.
--
Cheers,
Vu
Tuesday, 21 July 2009
PDF to raw texts
There are several ways to convert PDF files to raw text. In my opinion they fall into two groups: those that use OCR technology and those that do not. PDFBox is a freely available tool that does not use OCR, so the converted raw text suffers from some errors. To utilize OCR technology in conversion, there are a few tricks:
- use free or commercial tools like SimpleOCR, VeryPDF, OmniPage, ...
- copy and paste directly from PDF files. This trick only applies to PDF files that are not secured.
- use online tools (I am not sure what technologies they use internally :()
- leverage Google OCR (i.e. let Google do the work for us):
Convert Scanned PDFs to Text
Now if you have a bunch of scanned PDF files on your hard drive and no OCR software, here's what you can do to convert them into recognizable text.
Create a folder on your website (say abc.com/pdf) and upload all the PDF images to that folder. Now create a public web page that links to all the PDF files. Wait for the Google bots to spider your stuff.
Once done, type the query "site:abc.com/pdf filetype:pdf" to see the PDF documents as HTML.
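Whatever the conversion route, the resulting raw text usually needs post-processing. A tiny sketch of my own (a hypothetical helper, not part of PDFBox or any tool above) that repairs two frequent extraction artifacts: DOS carriage returns and words hyphenated across line breaks.

```python
import re

def clean_extracted_text(text):
    # drop carriage returns (same effect as the perl tr/\r//d one-liner below)
    text = text.replace("\r", "")
    # re-join words hyphenated across a line break: "informa-\ntion" -> "information"
    return re.sub(r"(\w+)-\n(\w+)", r"\1\2", text)

raw = "citation-based summa-\r\nrization of informa-\r\ntion"
cleaned = clean_extracted_text(raw)
# -> "citation-based summarization of information"
```

Note that an ordinary in-word hyphen ("citation-based") is left alone because it is not followed by a newline.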
Cheers,
Vu
Machine Learning and Natural Language
http://l2r.cs.uiuc.edu/~danr/Teaching/CS546-09/lectures.html
This is a helpful course taught by Prof. Dan Roth at UIUC. Perhaps I should spend more time self-studying some of the materials from this course to improve my background in machine learning for natural language.
--
Cheers,
Vu
Sunday, 19 July 2009
Term "Oracle"
I sometimes encountered the term "oracle" in papers, especially in Experiment and Evaluation sections, but did not quite understand what it means. I have recently figured it out: it refers to the upper bound of a measure used to assess the performance of a method, i.e. the score a system would achieve if it always made the best possible choice.
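A toy illustration of the idea (the candidates and metric here are made up for the example): the oracle always picks whichever candidate the evaluation metric scores highest, giving an upper bound that no real system can exceed.

```python
def oracle_score(candidate_sets, metric):
    """Upper bound: assume a perfect system that, for every test item,
    picks the candidate the evaluation metric likes best."""
    return sum(max(metric(c) for c in cands) for cands in candidate_sets) / len(candidate_sets)

# toy example: each item has candidate outputs whose quality the metric returns directly
candidates = [[0.2, 0.9], [0.5, 0.4], [0.1, 0.7]]
upper_bound = oracle_score(candidates, metric=lambda c: c)
# -> (0.9 + 0.5 + 0.7) / 3 = 0.7
```

Any actual method evaluated on the same items can then be reported as a fraction of this oracle upper bound.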
--
Cheers,
Vu
Baseline methods
How to effectively design the baseline methods for specific problems?
This is a question I raise whenever I approach a specific research problem. Baseline methods can be 1) state-of-the-art methods that were well studied in previous work (we can use them by re-implementing some of them, but how many are enough? That is a hard question), or 2) the simplest methods we can think of naturally, or methods we can easily propose, but they should not be too naive (beating naive methods degrades the value of your proposed methods).
Just my thoughts, any corrections are welcome!
--
Cheers,
Vu
Wilcoxon signed-rank test
This test is used to determine the statistical significance of differences between results, typically produced by different methods.
Should refer to the following tool "Significance Testing" maintained by Sebastian Pado:
http://www.nlpado.de/~sebastian/sigf.html
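To make the idea concrete, here is a minimal pure-Python sketch of the signed-rank statistic itself (zero differences dropped, average ranks for ties). The p-value computation is omitted, so use Pado's tool or a statistics package for real experiments.

```python
def wilcoxon_statistic(diffs):
    """Wilcoxon signed-rank statistic W = min(W+, W-) for paired score
    differences between two methods (e.g. per-document metric deltas)."""
    d = [x for x in diffs if x != 0]              # zero differences are discarded
    by_abs = sorted(d, key=abs)
    # assign ranks 1..n, averaging ranks over ties in |d|
    rank_of = {}
    i = 0
    while i < len(by_abs):
        j = i
        while j < len(by_abs) and abs(by_abs[j]) == abs(by_abs[i]):
            j += 1
        rank_of[abs(by_abs[i])] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    w_plus = sum(rank_of[abs(x)] for x in d if x > 0)
    w_minus = sum(rank_of[abs(x)] for x in d if x < 0)
    return min(w_plus, w_minus)

# toy example: method A beats B on 3 of 4 documents
w = wilcoxon_statistic([3, -1, 2, 4])             # -> 1.0 (only |-1| ranks against A)
```

A small W (relative to the critical value for n pairs) means the two methods' results differ significantly.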
--
Cheers,
Vu
New Book: Search User Interfaces
A book written by Prof. Marti A. Hearst at the University of California, Berkeley
Search User Interfaces: http://searchuserinterfaces.com/
Just a note, maybe it will be useful for me in the future.
--
Cheers,
Vu
Friday, 17 July 2009
Intuition and Observation in NLP research
I think intuition and observation are crucial factors that strongly shape the methods and approaches proposed for problems in NLP research. I have read quite a lot of NLP papers and noticed this. Due to the ambiguity of natural language, some NLP problems are solved heuristically, based on intuition and observation of what humans are able to do naturally.
Anyway, this is just my opinion and quite subjective. Please contradict me if you disagree!
--
Cheers,
Vu
Wednesday, 15 July 2009
NLP conferences links
Useful links to keep up with the newest conferences/journals:
1) http://www.cs.rochester.edu/~tetreaul/conferences.html (NLP conference acceptance rates)
2) http://www-tsujii.is.s.u-tokyo.ac.jp/~yoshinag/research/conference_link.html
Computer Science Conference Ranking
According to this list, some NLP-related conferences are ranked as follows:
- AAAI: American Association for AI National Conference (0.99)
- IJCAI: Intl Joint Conf on AI (0.96)
- SIGIR: ACM SIGIR Conf on Information Retrieval (0.96)
- ACL: Annual Meeting of the ACL - Association of Computational Linguistics (0.90)
- NAACL: North American Chapter of the ACL (0.88)
- CoNLL: Conference on Natural Language Learning (0.82)
- EMNLP: Empirical Methods in Natural Language Processing (0.79)
- COLING: International Conference on Computational Linguistics (0.64)
- EACL: Annual Meeting of European Association Computational Linguistics (0.62)
- PACLIC: Pacific Asia Conference on Language, Information and Computation (0.56)
- RANLP: Recent Advances in Natural Language Processing (0.54)
- NLPRS: Natural Language Pacific Rim Symposium (0.54)
--
Cheers,
Vu
ACL-IJCNLP'09 proceeding online
These proceedings will be officially archived in the ACL Anthology. The link below is only temporary, for anyone who wants to quickly browse the newest articles from the ACL conference.
http://nlp.csie.ncnu.edu.tw/%7Eshin/acl-ijcnlp2009/proceedings/CDROM/ACLIJCNLP/index.html
--
Cheers,
Vu
LEDA
A library of efficient data types and algorithms
http://www.cs.sunysb.edu/%7Ealgorith/implement/LEDA/implement.shtml
--
Cheers,
Vu
Tuesday, 14 July 2009
NLP/Computational Linguistics Anthology
There are very useful resources that support research in the field of Computational Linguistics and Natural Language Processing. Some of them are currently available on the web.
* ACL Anthology: http://www.aclweb.org/anthology-new/
- archives papers from major conferences and journals such as ACL, NAACL, EMNLP, COLING, the Computational Linguistics journal, ...
* The ACL Anthology Network: http://belobog.si.umich.edu/clair/anthology/index.cgi
- a very helpful network built on data archived from the ACL Anthology. It acts as a social network that unveils relationships between papers and authors.
* ACL Anthology Reference Corpus (ACL ARC): http://acl-arc.comp.nus.edu.sg/
- a corpus recently built by leading researchers around the world, aiming to boost research in the scientific domain.
* ACL Anthology SearchBench: http://aclasb.dfki.de/
--
Cheers,
Vu
Labels:
anthology,
computational linguistics,
NLP,
research
Monday, 13 July 2009
The Machine Learning Forum
http://seed.ucsd.edu/joomla15/
I think this is a great forum for anyone who wants to learn machine learning techniques and apply them to research problems in a specific domain.
--
Cheers,
Vu
Linux Ubuntu stuffs
Some required configuration steps (of course, specific to my situation):
1) Sharing folders between Windows XP (host) and Ubuntu Linux (guest) installed using VMware
on Linux
- create arbitrary folder to be shared
- install samba; it can be installed automatically via a wizard by right-clicking the shared folder and choosing the "Share" tab. Ubuntu will prompt you through the installation; just follow it :D.
- use the command # ifconfig | grep "inet addr:" to find the IP address of the Ubuntu Linux guest.
on Windows
- open "My Computer" and use the menu "Tools\Map Network Drive", see the following figure:
+ choose the drive letter on Windows that will be mapped to the shared folder on Ubuntu Linux
+ choose the address by clicking the "Browse" button and then selecting the appropriate shared folder on Ubuntu Linux
- set up javadocs for NetBeans:
+ download the JDK javadocs from the Sun website (choose the version matching your current JDK)
+ in the NetBeans IDE, choose menu Tools\Java Platforms\Javadoc and then locate the downloaded javadoc file
7) size of hard drives
- use the command # df -h
8) install eclipse
- sudo apt-get install eclipse
9) gnome commander - looks like Total Commander on Windows
http://www.nongnu.org/gcmd/ or sudo apt-get install gnome-commander
10) fix the CGI/Perl "bad interpreter" error - a very useful tip
- Link
- use the command # perl -i.bak -pe 'tr/\r//d' script_file (e.g. *.pl, *.sh)
11) install JDK/JRE on Ubuntu Linux and related configuration
- Link
Some useful links:
- IDEs for Developers: http://mashable.com/2007/11/17/ide-toolbox/
- Eclipse IDE: http://www.eclipse.org/
- EPIC (Eclipse Perl Integration): http://www.epic-ide.org/
- Anjuta IDE: http://projects.gnome.org/anjuta/index.shtml
Friday, 10 July 2009
Soft skills for scientific research
Very useful writings (in Vietnamese only):
1) http://tuanvannguyen.blogspot.com/2009/01/k-nng-mm-cho-nh-khoa-hc.html
2) http://groups.google.com/group/cvpr-hcmuns-vn/msg/13fe6e2c525e550a?
I think I have been experiencing similar circumstances during my research life, though this is just the beginning ^_^. As these two writings show, we should be diligent, patient, and consistent to maintain our passion for scientific research. Hopefully I will overcome the difficulties and realize my dream (becoming a well-skilled research scientist ^_^, still far from the present).
--
Cheers,
Vu
Thursday, 9 July 2009
Human computation
Manual data annotation is costly: the process is time-consuming, labor-intensive, and error-prone. Recently, human computation has emerged as a viable alternative for data annotation; the idea is to harness what humans are good at but machines are poor at. Many tasks remain trivial for humans yet continue to challenge even the most sophisticated computer programs, so intelligently combining computers and humans to solve complex tasks is becoming a promising approach. Two typical human-computation frameworks for data annotation are Games With A Purpose (GWAP) and Amazon Mechanical Turk (AMT).
During my earlier research at SoC@NUS, I had a chance to undertake an analysis survey on human computation. I will post it here as soon as possible for your quick reference.
I think that human computation can become a primary research tool for quickly creating evaluation data in the future.
--
Cheers,
Vu
Tuesday, 7 July 2009
How to read a scientific article?
Do you think you already read scientific articles effectively and efficiently? If not, you can read some of the following articles:
www.owlnet.rice.edu/~cainproj/courses/sci_article.pdf
http://www.lib.purdue.edu/phys/assets/SciPaperTutorial.swf (very nice presentation :D)
For a newbie in research like me, it is extremely important to learn about.
--
Cheers,
Vu
Hypotheses
I have learned one lesson from my adviser.
Sometimes you are thinking about a problem and want to figure out a solution for it. The best way is to first think of some hypotheses for your problem theoretically. You then validate these hypotheses empirically through experiments, explaining and analyzing why the results look the way they do. It is worth noting that sometimes your data may not support your hypotheses.
Please do not just run experiments; on their own they are not helpful.
Hopefully this lesson will help me a lot from now on. Keep up my best effort in my research.
--
Cheers,
Vu
Tools supporting our brainstorming
FreeMind
Link:
http://freemind.sourceforge.net/wiki/index.php/Main_Page
Tutorial
http://freemind.sourceforge.net/wiki/index.php/Tutorial_effort
FreeMind in YouTube
http://www.youtube.com/watch?v=grut_2cardM
Vietnamese book
http://www.vinabook.com/lap-ban-do-tu-duy-cong-cu-tu-duy-toi-uu-se-lam-thay-doi-cuoc-song-cua-ban-m11i21657.html
Thanks Prof. Duy-Dinh LE for sharing this information.
--
Cheers,
Vu
Topic modeling toolkit
MALLET: http://mallet.cs.umass.edu/
A famous toolkit supporting various algorithms for discovering latent topics in raw text.
--
Cheers,
Vu
Summarization toolkit
MEAD: http://www.summarization.com/mead developed by Dragomir Radev at Univ. of Michigan
- uses a centroid-based summarization algorithm; read the related papers for details
- some trouble may be encountered when installing this tool; please read the README file carefully before using it
- useful FAQs relating to MEAD: http://www.summarization.com/~radev/mead/email/
ROUGE evaluation: http://www.isi.edu/licensed-sw/see/rouge/index.html
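As a note to self, the centroid idea in miniature (my own toy sketch, not MEAD's actual implementation; the top-5 word cutoff is an arbitrary assumption): build a centroid of the collection's most frequent words and rank sentences by how much of it they cover.

```python
from collections import Counter

def centroid_summary(sentences, n_select=1, centroid_size=5):
    """Pick the sentences that best cover the collection-wide centroid."""
    counts = Counter(w for s in sentences for w in s.lower().split())
    centroid = dict(counts.most_common(centroid_size))
    def coverage(s):
        # sum centroid weights of the distinct words the sentence contains
        return sum(centroid.get(w, 0) for w in set(s.lower().split()))
    return sorted(sentences, key=coverage, reverse=True)[:n_select]

docs = ["stocks fell", "markets fell", "stocks and markets fell sharply", "the cat sat"]
summary = centroid_summary(docs)
# the sentence covering the most centroid words is selected
```

Real systems like MEAD also add positional and length features on top of the centroid score, and ROUGE then compares the selected sentences against reference summaries.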
--
Cheers,
Vu
Machine Learning tools
WEKA: http://www.cs.waikato.ac.nz/~ml/index.html
Various machine learning algorithms (SVMs, naive Bayes, decision trees, ...) are integrated in this tool. It also supports visualization of machine learning data, which is very efficient and effective for observation and analysis.
--
Cheers,
Vu
Reference Manager Tools (part 1)
There are currently many tools available that support managing references in scientific research. I would like to introduce some of them that are quite good, in my opinion:
JabRef: http://sourceforge.net/projects/jabref/. The JabRef output can be flexibly customized according to various file formats (e.g. HTML, PDF, ...), see the following figure for a typical example (thanks Zhao Shanheng for sharing this template file):
Zotero (Firefox addin): http://www.zotero.org/
ForeCiteNote: http://forecitenote.comp.nus.edu.sg . This is one of the exciting projects undertaken by the WING research group at SoC@NUS. Try it!
--
Cheers,
Vu
upcoming NLP conferences in 2010
- ACL 2010 (http://acl2010.org/) [rank 1]
- NAACL 2010 (http://naaclhlt2010.isi.edu/) [rank 1]
- EMNLP 2010 [rank 2]
- COLING 2010 (http://www.coling-2010.org/) [rank 2]
- CICLing 2010 [rank 3]
- AI related (NLP tracks):
+ AAAI'10 (http://www.aaai.org/Conferences/AAAI/aaai10.php) [rank 1]
+ ECAI'10 (http://ecai2010.appia.pt/) [rank 2]
+ PRICAI'10 (http://www.pricai2010.org/) [rank 3]
+ ICTAI'10 [rank 2]
My preference is AAAI > EMNLP/COLING/ECAI > PRICAI/CICLing.
Note that I will update their paper submission deadlines as soon as possible :D.
--
Cheers,
Vu
LaTeX and its related issues
Templates for LaTeX for beginners:
ACM templates: http://www.acm.org/sigs/publications/proceedings-templates
LaTeX editors:
TeXnicCenter: http://www.texniccenter.org/
Texmaker: http://www.xm1math.net/texmaker/ (cross-platform)
TeXstudio: http://texstudio.sourceforge.net/ (cross-platform)
PSTricks:
PSTricks: http://en.wikipedia.org/wiki/PSTricks
LaTeXDraw: http://latexdraw.sourceforge.net/
(supports automatic generation of LaTeX code, very effective)
Tools supporting vector graphics:
Inkscape: http://www.inkscape.org
GIMP: http://www.gimp.org/
LaTeX tutorials:
http://www.stat.cmu.edu/~hseltman/LatexTips.html
http://www.artofproblemsolving.com/LaTeX/AoPS_L_About.php
http://en.wikibooks.org/wiki/LaTeX
http://dcwww.fys.dtu.dk/~schiotz/comp/LatexTips/LatexTips.html
http://www.maths.tcd.ie/~dwilkins/LaTeXPrimer/Index.html
http://texblog.wordpress.com/: a great LaTeX site
LaTeX resources:
http://www.eng.cam.ac.uk/help/tpl/textprocessing/
LaTeX community: http://www.latex-community.org/
LaTeX tips (ubiquitous):
1) Quotation marks and dashes
Single quotation marks are produced in LaTeX using ` and '. Double quotation marks are produced by typing `` and ''. (The undirected double quote character " produces double right quotation marks: it should never be used where left quotation marks are required.)
LaTeX allows you to produce dashes of various lengths, known as `hyphens', `en-dashes' and `em-dashes'. Hyphens are obtained in LaTeX by typing -, en-dashes by typing -- and em-dashes by typing ---.
One normally uses en-dashes when specifying a range of numbers. Thus, for example, to specify a range of page numbers, one would type
on pages 155--219.
Dashes used for punctuation are often typeset as em-dashes, especially in older books. These are obtained by typing ---.
(Source: http://www.maths.tcd.ie/~dwilkins/LaTeXPrimer/QuotDash.html)
2) LaTeX mathematical equation editor: http://www.codecogs.com/components/eqneditor/editor.php -> an interactive tool, very cool!
3) Some strategies to include graphics in LaTeX documents:
http://www.tug.org/TUGboat/Articles/tb26-1/hoeppner.pdf
4) Using Visio to create EPS files (very helpful):
http://www.win.tue.nl/latex/visioeps.html
http://www.adobe.com/support/downloads/thankyou.jsp?ftpID=1500&fileID=1438 (EPS printer driver for Visio)
5) Rename "Contents" to "Table of Contents" using
\renewcommand\contentsname{Table of Contents}
6) Special characters in LaTeX:
http://www.noao.edu/noaoprop/help/symbols/
7) Spell checking for LaTeX documents:
http://www.microspell.com/cgi-bin/spellform.pl
8) Footnote with caption:
http://www.latex-community.org/forum/viewtopic.php?f=5&t=1078
9) LaTeX tables with bar charts:
http://www.keithv.com/software/barchart/
10) LaTeX mathematics equation tips: http://moser.cm.nctu.edu.tw/docs/typeset_equations.pdf
11) ... (to be added)
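The quotation-mark and dash rules from tip 1, condensed into one example line of LaTeX source:

```latex
% `single' and ``double'' quotes; - hyphen, -- en-dash (ranges), --- em-dash
She said ``the well-known results appear on pages 155--219'' --- see tip 1.
```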