
LPPMT 2009 : Linguistic pre-processing for MT


Link: http://summitxii.amtaweb.org/summitxii-cfp-ws6.html
 
When Aug 30, 2009 - Aug 30, 2009
Where Ottawa, Ontario, Canada
Submission Deadline May 8, 2009
Notification Due Jun 12, 2009
Final Version Due Jul 10, 2009
Categories    NLP   linguistics
 

Call For Papers

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* FIRST CALL FOR PAPERS *



Workshop on

Linguistic pre-processing for MT



August 30, 2009

Machine Translation Summit XII

Ottawa, Ontario, Canada



We invite proposals for presentation at the Workshop on Linguistic pre-processing for MT, being held in conjunction with MT Summit XII.



WORKSHOP DESCRIPTION

Input for MT varies significantly in terms of spelling, terminology, word order phenomena, dialects, and sentence types, even within the same language. With user-generated content, this variability increases enormously. MT systems, and NLP systems generally, cannot effectively cover all of this variability -- usually because they are built to deal with professionally written technical or journalistic texts. Robust and reliable systems for mapping highly variable, uncontrolled writing into more consistent, tractable, "controlled" sentences will improve MT, search, and other NLP tasks. Current approaches to this problem include manually pre-editing the input texts -- as discussed, for example, in the series of CLAW workshops -- and/or expanding the coverage of MT systems.



One alternative approach is to pre-process or normalize the input automatically before MT. Translation of subtitles for television (Flanagan, 2006), non-fluent speech, low-quality OCR, and non-standard writing from limited-proficiency writers are only some of the application scenarios that require automatic linguistic pre-processing to improve MT output. For example, Callison-Burch (2007) showed that substitution of lexical paraphrases improved MT output. Xu & Seneff (2008) and Collins, Koehn & Kucerova (2005) re-arranged word order to improve the performance of a statistical MT system. Yet another alternative approach is to produce a linguistically "enriched" input, in the form of lattices, trees, markup, etc., and to allow for final interpretation later in the translation pipeline and/or with a direct feedback capability to force emergent behavior. Some approaches may even call into question the need for a strict, linear processing pipeline and may employ adaptive, iterative, or self-learning methods.



Common to all of these alternatives is the strategy of deploying significant linguistic and non-linguistic knowledge before translation itself occurs. This raises many questions about which kinds of knowledge have the biggest impact on translation, which can be automated most reliably and robustly, and which are most cost-effective and scalable.



This workshop aims to compare and contrast some of the various techniques and approaches to these kinds of linguistic pre-processing for MT. The workshop will consist of a set of papers that will be selected by peer review.



IMPORTANT DATES



Paper submission deadline: May 8, 2009

Notification of acceptance: June 12, 2009

Camera ready submissions: July 10, 2009



WORKSHOP TOPICS



We welcome submissions about the main theme of this workshop. Specific topics include but are not limited to:

* Paraphrase generation

* Syntactic reordering

* Lexical / Terminological substitution

* Error detection and automatic correction

* Processing user-generated content

* Monolingual MT

* Confidence scoring

* Self-learning and adaptability



SUBMISSION REQUIREMENTS



Papers should not have been presented previously or be under consideration for publication elsewhere, and should not identify the author(s). They should emphasize completed work rather than intended work. Each paper will be anonymously reviewed by the program committee.



Papers must be submitted in PDF format to mike [at] mikedillinger [dot] com by midnight of the due date. Submissions should be in English. Each paper should be attached to an email giving the paper's title and contact information for the author(s). Papers should not exceed 8 pages, including references and tables, and should follow the formatting guidelines posted at the MT Summit web site.



CONTACT INFORMATION



For further information, contact the organizing committee at mike [at] mikedillinger [dot] com



ORGANIZING COMMITTEE



Mike Dillinger, Translation Optimization Partners (Primary Contact)



PROGRAM COMMITTEE

* Alon Lavie (CMU)

* Farzad Ehsani (Fluential Inc)

* Hassan Sawaf (Apptek)

* Jörg Schütz (Bioloom Group)

* Philipp Koehn (U Edinburgh)


