WMT 2011: SIXTH WORKSHOP ON STATISTICAL MACHINE TRANSLATION
Link: http://www.statmt.org/wmt11/

Call For Papers

This workshop builds on five previous workshops on statistical machine translation: the NAACL-2006 Workshop on Statistical Machine Translation, the ACL-2007 Workshop on Statistical Machine Translation, the ACL-2008 Workshop on Statistical Machine Translation, the EACL-2009 Workshop on Statistical Machine Translation, and the ACL-2010 Workshop on Statistical Machine Translation.

This year's workshop will feature three shared tasks: a shared translation task, a system combination shared task, and a shared evaluation task to test automatic evaluation metrics.

The shared translation task will include a featured task this year: translating disaster response SMS messages from Haitian Creole to English. The goal is to delve into the scientific challenges of producing machine translation systems useful enough to help first responders translate messages sent in the aftermath of disasters like the earthquake that struck Haiti in January of 2010. Low-resource languages and noisy, informal input texts are major challenges for statistical machine translation.

In addition to the shared tasks, the workshop will also feature scientific papers on topics related to MT. Topics of interest include, but are not limited to:
- word-based, phrase-based, syntax-based SMT
- using comparable corpora for SMT
- incorporating linguistic information into SMT
- decoding
- system combination
- error analysis
- manual and automatic methods for evaluating MT
- scaling MT to very large data sets

We encourage authors to evaluate their approaches to the above topics using the common data sets created for the shared tasks.

TRANSLATION TASK

The first shared task will examine translation between the following language pairs:
- English-German and German-English
- English-French and French-English
- English-Spanish and Spanish-English
- English-Czech and Czech-English
- Haitian Creole to English

Participants may submit translations for any or all of the language directions.

In addition to the common test sets, the workshop organizers will provide optional training resources, including a newly expanded release of the Europarl corpora and out-of-domain corpora.

All participants who submit entries will have their translations evaluated. We will evaluate translation performance by human judgment. To facilitate the human evaluation, we will require participants in the shared tasks to manually judge some of the submitted translations.

We also provide baseline machine translation systems, with performance comparable to the best systems from last year's shared task.

SYSTEM COMBINATION TASK

Participants in the system combination task will be provided with the 1-best translations from each of the systems entered in the shared translation task. We will endeavor to provide a held-out development set for system combination, which will include translations from each of the systems and a reference translation. Any system combination strategy is acceptable, whether it selects the best translation on a per-sentence basis or creates novel translations by combining the systems' translations. The quality of the system combinations will be judged alongside the individual systems during the manual evaluation, as well as scored with automatic evaluation metrics.

EVALUATION TASK

The evaluation task will assess automatic evaluation metrics' ability to:
- Rank systems on their overall performance on the test set
- Rank systems on a sentence-by-sentence level

Participants in the shared evaluation task will use their automatic evaluation metrics to score the output from the translation task and the system combination task. They will be provided with the output from the other two shared tasks along with reference translations. We will measure the correlation of automatic evaluation metrics with the human judgments.
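
The call does not say which correlation statistic will be used for the evaluation task. As a rough sketch only, the following assumes Spearman's rank correlation at the system level, computed as the Pearson correlation of tie-averaged ranks. The function names, system names, and scores are all hypothetical and are not taken from the workshop data.

```python
import math
from typing import Dict, List


def average_ranks(values: List[float]) -> List[float]:
    """Rank values from 1 (highest) downward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of the 1-based positions in the tie group
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(xs: List[float], ys: List[float]) -> float:
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when all scores are equal")
    return cov / math.sqrt(var_x * var_y)


def spearman(metric: Dict[str, float], human: Dict[str, float]) -> float:
    """Spearman's rho between a metric's system scores and human scores."""
    if set(metric) != set(human):
        raise ValueError("both score tables must cover the same systems")
    systems = sorted(metric)
    return pearson(
        average_ranks([metric[s] for s in systems]),
        average_ranks([human[s] for s in systems]),
    )


if __name__ == "__main__":
    # Hypothetical scores: higher is better in both tables.
    metric_scores = {"sysA": 0.31, "sysB": 0.27, "sysC": 0.22, "sysD": 0.25}
    human_scores = {"sysA": 0.62, "sysB": 0.55, "sysC": 0.41, "sysD": 0.58}
    print(spearman(metric_scores, human_scores))
```

A value near 1 would mean the metric orders the systems much as the human judges do. Sentence-level ranking would need a different setup, since human judgments there compare individual translations rather than whole systems.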

PAPER SUBMISSION INFORMATION

Submissions will consist of regular full papers of 6-10 pages, plus additional pages for references, formatted following the EMNLP 2011 guidelines. In addition, shared task participants will be invited to submit short papers (4-6 pages) describing their systems or their evaluation metrics. Both submission and review processes will be handled electronically.

We encourage individuals who are submitting research papers to evaluate their approaches using the training resources provided by this workshop and past workshops, so that their experiments can be repeated by others using these publicly available corpora.

IMPORTANT DATES

- Release of training data: January 24, 2011
- Test set distributed for translation task: March 14, 2011
- Submission deadline for translation task: March 20, 2011
- Translations released for system combination: March 25, 2011
- System combination deadline: April 1, 2011
- Start of manual evaluation period: April 1, 2011
- End of manual evaluation: May 31, 2011
- Paper submission deadline: May 19, 2011
- Notification of acceptance: June 17, 2011
- Camera-ready deadline: July 1, 2011
- Papers available online: July 23, 2011
- Workshop in Edinburgh following EMNLP: July 30-31, 2011