
SPMRL 2010 : First Workshop on Statistical Parsing of Morphologically Rich Languages


When Jun 5, 2010 - Jun 6, 2010
Where Los Angeles, CA, USA
Submission Deadline Mar 1, 2010
Notification Due Mar 30, 2010
Final Version Due Apr 12, 2010
Categories    NLP

Call For Papers

NAACL-HLT 2010 First Workshop on Statistical Parsing
of Morphologically Rich Languages (SPMRL 2010)
June 5 or 6, 2010, Los Angeles, CA
Submission Deadline: March 01, 2010
Sponsored by SIGPARSE

The aim of this workshop is to bring together researchers interested
in parsing languages whose morphological structure is richer than that
of English, and to provide a forum for discussing the challenges
associated with parsing such languages and sharing strategies towards
their solutions. We are interested in presentations relating to
actively studied areas of research including the adaptation of
existing parsing techniques to new languages, the design of new models
that take morphological information into account, the implementation
of models that allow robust statistics to be obtained in the face of
high word-form variation, and so on.

Submission deadline: March 1, 2010
Notification to authors: March 30, 2010
Camera ready copy: April 12, 2010
Workshop: June 5 or 6, 2010


The availability of large syntactically annotated corpora led to an
explosion of interest in statistical parsing methods, and to the
development of successful models for parsing English using the Wall
Street Journal Penn Treebank (PTB, Marcus et al., 1993). In recent
years, parsing accuracy on the PTB has reached a ceiling of
90-92% F-score using the Parseval evaluation metrics
(Black et al., 1991). When adapted to other language/treebank pairs
(such as German, Hebrew, Arabic, Italian or French), these models have
been shown to be considerably less successful.

Among the arguments that have been proposed to explain this
performance gap are the impact of small training data size,
differences in treebank annotation schemes, inadequacy of evaluation
metrics, as well as linguistic factors such as the degree of word
order freedom and the use of morphological information in the
parser. None of these arguments in isolation can account for the
systematic performance deterioration, but observed from a wider,
cross-linguistic perspective, a picture begins to emerge -- the
morphologically rich nature of some of the languages makes them
inherently more susceptible to such performance degradation.

Morphologically rich languages (MRLs) are particularly challenging for
the application of algorithms primarily designed to parse English.
These algorithms focus on learning word order but they often do not
take morphological information into account. Another typical problem
associated with parsing MRLs is increased lexical data sparseness due
to high morphological variation in surface forms. In a more general
setup, this problem is akin to handling out-of-vocabulary or rare
words for robust statistical parsing and techniques for domain
adaptation via lexicon enhancement (also explored for English and less
morphologically rich languages).

As well as technical and linguistic difficulties, lack of
communication between researchers working on different MRLs can lead
to a "reinventing the wheel" syndrome; the prominence of English parsing
in the literature reduces the visibility of research aiming to solve
the problems particular to MRLs. By offering a platform to this
growing community of interest, we hope to overcome this potential
cultural obstacle.

We solicit papers describing parsing experiments with models and
architectures for languages with morphological structure richer than
English, or studies that address the lexical sparseness challenges
(for any language). The workshop's areas of interest include, but are
not limited to, the following list of topics:

* Parsing models and architectures that explicitly integrate
morphological analysis and parsing
* Parsing models and architectures that focus on lexical coverage
and the handling of OOV words either by incorporating linguistic
knowledge or through the use of unsupervised/semi-supervised
learning techniques
* Cross-language and cross-model comparison of models' strengths and
weaknesses in the face of particular linguistic phenomena (e.g.
morphosyntactic characteristics, degree of word-order freedom ...)
* Comprehensive analyses of the strengths and weaknesses of various
parsing models on particular linguistic (e.g. morphosyntactic)
phenomena with respect to variation in tagsets, annotation schemes
and additional data transformations

Authors are invited to submit long papers (up to 8 pages + 1 extra
page for references) and short papers (up to 4 pages + 1 extra page
for references). Long papers should describe unpublished, substantial
and completed research. Short papers should be position papers,
papers describing work in progress or short, focused contributions.

Papers will be accepted until March 1, 2010 in PDF format
via the START system. Please watch the workshop page for additional
last-minute details:

Submitted papers must follow the style and formatting guidelines
used for the NAACL conference; see the details at:

As the reviewing will be blind, the paper must not include the
authors' names and affiliations.
Furthermore, self-references that reveal the author's identity, e.g.,
"We previously showed (Smith, 1991) ..." must be avoided. Instead, use
citations such as "Smith previously showed (Smith, 1991) ..."
Papers that do not conform to these requirements will be rejected
without review. In addition, please do not post your submissions on
the web until after the review process is complete.

Djamé Seddah, Jennifer Foster, Sandra Kübler, Reut Tsarfaty,
Lamia Tounsi, Yannick Versley, Marie Candito, Ines Rehbein,
Yoav Goldberg

Mohamed Attia (Dublin City University, Ireland)
Adriane Boyd (Ohio State University, USA)
Aoife Cahill (University of Stuttgart, Germany)
Marie Candito (University of Paris 7, France)
Grzegorz Chrupala (Saarland University, Germany)
Benoit Crabbé (University of Paris 7, France)
Michael Elhadad (Ben Gurion University, Israel)
Jennifer Foster (Dublin City University, Ireland)
Josef van Genabith (Dublin City University, Ireland)
Yoav Goldberg (Ben Gurion University, Israel)
Julia Hockenmaier (University of Illinois, USA)
Deirdre Hogan (Dublin City University, Ireland)
Sandra Kübler (Indiana University, USA)
Alberto Lavelli (FBK-irst, Italy)
Joseph Le Roux (Dublin City University, Ireland)
Wolfgang Maier (University of Tübingen, Germany)
Takuya Matsuzaki (University of Tokyo, Japan)
Detmar Meurers (University of Tübingen, Germany)
Yusuke Miyao (University of Tokyo, Japan)
Joakim Nivre (Uppsala University, Sweden)
Ines Rehbein (Saarland University, Germany)
Kenji Sagae (University of Southern California, USA)
Djamé Seddah (University of Paris Sorbonne, France)
Khalil Sima'an (University of Amsterdam, The Netherlands)
Nicolas Stroppa (Yahoo! Research Paris, USA)
Lamia Tounsi (Dublin City University, Ireland)
Reut Tsarfaty (University of Amsterdam, The Netherlands)
Yannick Versley (University of Tübingen, Germany)

Sandra Kübler, Indiana University
Djamé Seddah, Université Paris-Sorbonne
Reut Tsarfaty, University of Amsterdam
To contact the organizers:

This workshop is sponsored by SIGPARSE and by INRIA's Alpage
