
LIHMT 2011 : Using Linguistic Information for Hybrid Machine Translation


When Nov 18, 2011 - Nov 18, 2011
Where Barcelona, Spain
Submission Deadline Sep 16, 2011
Notification Due Oct 7, 2011
Final Version Due Oct 21, 2011
Categories    machine translation   hybrid mt   mt evaluation

Call For Papers



Workshop on Using Linguistic Information for Hybrid Machine Translation

Friday, November 18, 2011. Barcelona, Spain.

In conjunction with the Shared Task on Applying Machine Learning techniques
to optimising the division of labour in Hybrid MT (ML4HMT-2011)

Following on from the OpenMT Workshop on Mixing Approaches to Machine
Translation in 2008, the aim of this
OpenMT-2 Workshop on Using Linguistic Information for Hybrid Machine
Translation (HMT) is to promote corpus-based methods and technologies
that combine resources and algorithms from the three general approaches
to MT: rule-based (RBMT), example-based (EBMT) and statistical (SMT).

The boundaries between these three approaches have narrowed:
(i) string-based SMT models are being augmented with morphological,
syntactic or semantic information,
(ii) RBMT systems are using parallel corpora to improve results by
enriching their lexicons and grammars and applying new methods for
(iii) research has shown that benefits can be accrued by combining MT
systems based on different MT approaches.

At the same time, data-driven Machine Translation (EBMT and SMT) is
nowadays prevalent within the MT research community and translation
results obtained using these approaches have now reached a reasonably
useful level of quality, especially when the target language is English.
But such data-driven MT systems base their knowledge on bilingual
aligned corpora, and the accuracy of their output depends heavily on the
quality and the size of those corpora. Large and reliable bilingual
corpora are unavailable for many language pairs. In addition,
translating into morphologically rich target languages makes the
training of data-driven systems rather more difficult.

Workshop Programme
The one-day workshop is being organised as part of the dissemination
effort of the OpenMT-2 project, a Spanish government funded, three-year,
multisite research effort addressing, on the one hand, approaches to
integrating structural information (morphological, syntactic and
semantic) into open-source SMT and, on the other, the development of novel
automatic MT evaluation methods using linguistically motivated metrics. Thus,
the central issues to be addressed during the workshop include:
methods and techniques for integrating structural information (syntactic
and semantic) into HMT,
methods and techniques for handling morphologically rich languages (e.g.
Basque) within HMT,
alternative approaches to automatic MT evaluation which rely on
linguistic criteria.

The programme will include three invited plenary talks, each addressing
one of the central issues above, and the presentation of a number of
refereed contributions on related topics. The invited speakers include:
Ondřej Bojar (Charles University, Czech Republic)
Topic: Treatment of morphologically rich languages for HMT,
Alon Lavie (Carnegie Mellon University, USA)
Topic: Integrating structural information into HMT,
Lucia Specia (University of Wolverhampton, UK)
Topic: Linguistic Indicators for Quality Estimation of Machine Translation.

The workshop will conclude with a brief panel discussion summarising the
results of the presentations as they impact the central issues.

Topics of Interest
We are particularly interested in papers describing research and
development in the following areas:
methods to compare and combine translation-outputs obtained from
different MT systems,
methods for dealing with languages with rich morphology within
data-driven approaches,
approaches to developing morphologically, syntactically or semantically
augmented SMT models,
new automatic (or manual) MT evaluation methods based on linguistically
motivated metrics,
descriptions of open-source or free language resources that are
available for developing hybrid MT systems.

All contributions will be published in the workshop proceedings.

Important Dates
Paper submission deadline: Sept. 16, 2011,
Notification of acceptance: Oct. 7, 2011,
Final version of paper: Oct 21, 2011,
Workshop: Nov 18, 2011.

Papers should be in English and no longer than 8 pages. Please
follow the ACL HLT 2011 formatting requirements for long papers.

To submit contributions, please follow the instructions on the EasyChair
conference management system submission website.

The deadline for submission is September 16, 2011.
The contributions will undergo a double-blind review by three members of
the programme committee.

Please address queries to

Programme committee

Co-Chair: David Farwell (Technical University of Catalonia, TALP, Barcelona)
Co-Chair: Gorka Labaka (University of the Basque Country, Donostia)

Iñaki Alegria (University of the Basque Country, Donostia)
Ondřej Bojar (Charles University, Czech Republic)
Arantza Díaz de Ilarraza (University of the Basque Country, Donostia)
Chris Dyer (Carnegie Mellon University, US)
Cristina España (Technical University of Catalonia, TALP, Barcelona)
Marcello Federico (Fondazione Bruno Kessler, Italy)
Mikel Forcada (University of Alacant, Alicante)
Adrià de Gispert (University of Cambridge, UK)
Kevin Knight (Information Sciences Institute, US)
Philipp Koehn (University of Edinburgh, UK)
José Mariño (Technical University of Catalonia, TALP, Barcelona)
Lluís Màrquez (Technical University of Catalonia, TALP, Barcelona)
Hermann Ney (RWTH-Aachen, Germany)
Daniele Pighin (Technical University of Catalonia, TALP, Barcelona)
Aarne Ranta (Chalmers University of Technology, Gothenburg, Sweden)
Marta R. Costa-jussà (Barcelona Media, Spain)
Felipe Sánchez-Martínez (University of Alacant, Alicante)
Kepa Sarasola (University of the Basque Country, Donostia)
Lucia Specia (University of Wolverhampton, UK)
Dekai Wu (Hong Kong University of Science and Technology, China)

Local organization
Centre for Speech and Language Applications and Technologies (TALP),
Technical University of Catalonia (UPC).
Committee members: David Farwell (Chair), Amarin Deemagarn, Cristina
España, Meritxell González, Lluís Màrquez, Daniele Pighin.

Co-located Shared Task
Co-located with LIHMT, the ML4HMT-2011 workshop will explore alternatives in
order to provide optimal support for Hybrid MT design, using sophisticated
machine-learning techniques. One further important objective of the workshop
is to build bridges from MT to the ML community to systematically and
jointly explore the choice space for Hybrid MT.

The "Shared Task on Optimising the Division of Labour in Hybrid MT" is an
effort to trigger systematic investigation on improving state-of-the-art
Hybrid MT, using advanced machine-learning (ML) methodologies. Participants
are requested to build Hybrid/System Combination systems by combining the
output of several systems of different types, which is provided by the

About the OpenMT-2 Project
The main goal of the OpenMT-2 project is the development of Open Source
Machine Translation Architectures based on hybrid models and advanced
semantic processors. These architectures will be open-source systems
combining the three main Machine Translation frameworks –Rule-Based MT
(RBMT), Statistical MT (SMT) and Example-Based MT (EBMT)– into hybrid
systems. Implemented architectures and systems will be Open Source, so
they will allow rapid system adaptation or development of new advanced
Machine Translation systems for other languages. We will test system
functionality for different languages: English, Spanish, Catalan and
Basque; thus evaluating such architectures in different contexts. While
there are many corpus resources for English and Spanish, there are not
so many for Catalan and Basque. While the structure of some of those
languages is very similar (Catalan and Spanish), others are very
different (English and Basque). Basque is an agglutinative and highly
inflecting language, unlike English, Catalan and Spanish.

In parallel there has been extensive work on developing an automatic
Evaluation platform that supports the introduction of linguistically
motivated morphological, syntactic and semantic metrics into the design
of MT Evaluation methodologies. It also supports the development and
testing of concrete, linguistically-based evaluation techniques.

The main innovative points of the OpenMT-2 project are:
The design of hybrid systems combining traditional linguistic rules,
example-based methods and statistical methods.
The development of MT evaluation methods based on linguistically
motivated metrics.
The implementation of Open Source Systems.
The use of advanced syntactic and semantic processing in MT.

For further details, see the OpenMT-2 website.
