
ROBUS 2011 : Workshop on Robust Unsupervised and Semisupervised Methods


When Sep 15, 2011 - Sep 16, 2011
Where Hissar, Bulgaria
Submission Deadline Jul 15, 2011
Notification Due Aug 15, 2011
Categories    NLP

Call For Papers

Workshop on Robust Unsupervised and Semisupervised Methods in Natural Language Processing (ROBUS2011)


(in conjunction with RANLP 2011)
Hissar, Bulgaria, 15-16 September 2011



In natural language processing (NLP), supervised learning scenarios are more frequently explored than unsupervised or semi-supervised ones. Unfortunately, labeled data are often highly domain-dependent and in short supply. It has therefore become increasingly important to leverage both labeled and unlabeled data to achieve the best performance in challenging NLP problems that involve the learning of structured variables.

Until recently, most results in semi-supervised learning of structured variables in NLP were negative (Abney, 2008), but today the best part-of-speech taggers (Suzuki et al., 2008), named entity recognizers (Turian et al., 2010), and dependency parsers (Sagae and Tsujii, 2007; Suzuki et al., 2009; Søgaard and Rishøj, 2010) exploit mixtures of labeled and unlabeled data. Unsupervised and minimally supervised NLP are also growing rapidly.

The most commonly used semi-supervised learning algorithms in NLP are feature-based methods (Koo et al., 2008; Sagae and Gordon, 2009; Turian et al., 2010) and EM, self-training, or co-training (Mihalcea, 2004; Sagae and Tsujii, 2007; Spoustova et al., 2009). Mixture models have also been used successfully (Suzuki and Isozaki, 2008; Suzuki et al., 2009). While feature-based methods seem relatively robust, self-training and co-training are very parameter-sensitive, and parameter tuning has therefore become an important research topic (Goldberg and Zhu, 2009). This is a concern not only in NLP but also in other areas such as face recognition (e.g., Yan and Wang, 2009). Parameter sensitivity is even more pronounced in unsupervised learning of structured variables, e.g., unsupervised part-of-speech tagging and grammar induction.

By more robust unsupervised or semi-supervised learning algorithms we mean algorithms with few parameters that give good results across different data sets and different applications.

Specifically, we encourage submissions on the following topics:
* assessing robustness of known or new unsupervised or semi-supervised methods across different NLP problems or languages
* new unsupervised or semi-supervised methods for NLP problems
* positive and negative results on the use of unsupervised or semi-supervised methods in applications
* application-oriented evaluation of unsupervised or semi-supervised methods
* comparison and combination of unsupervised or semi-supervised methods

This workshop aims to bring together researchers dedicated to designing and evaluating robust unsupervised or semi-supervised learning algorithms for NLP problems. These problems include, but are not limited to, POS tagging, grammar induction and parsing, named entity recognition, word sense induction and disambiguation, machine translation, sentiment analysis, and taxonomy learning. Our goal is to evaluate known unsupervised and semi-supervised learning algorithms, foster novel and more robust ones, and discuss positive and negative results that might otherwise not appear in a technical paper at a major conference. We welcome submissions that address the robustness of unsupervised or semi-supervised learning algorithms for NLP, and especially encourage authors to provide results for different data sets, languages, or applications.

Submission deadline: July 15 2011.
Notification: August 15 2011.
Workshop: September 15-16 2011.

Use the RANLP style sheets found here:
We invite long (8 pages) and short (4 pages) papers. All papers will appear in the ACL bibliography. Accepted short papers will be presented either as short oral presentations or as posters.

Program Committee:
* Steven Abney, University of Michigan
* Stefan Bordag, ExB Research & Development
* Eugenie Giesbrecht, FZI Karlsruhe
* Katja Filippova, Google
* Florian Holz, University of Leipzig
* Jonas Kuhn, University of Stuttgart
* Vivi Nastase, HITS Heidelberg
* Reinhard Rapp, JG University of Mainz
* Lucia Specia, University of Wolverhampton
* Valentin Spitkovsky, Stanford University
* Sven Teresniak, University of Leipzig
* Dekai Wu, HKUST
* Torsten Zesch, TU Darmstadt
* Jerry Zhu, University of Wisconsin-Madison

Organizers:
Chris Biemann, TU Darmstadt
Anders Søgaard, University of Copenhagen

CONTACT: soegaard(at)

Steven Abney. 2008. Semi-supervised learning for computational linguistics. Chapman & Hall.
Andrew Goldberg and Jerry Zhu. 2009. Keepin' it real: semi-supervised learning with realistic tuning. In NAACL.
Terry Koo et al. 2008. Simple semi-supervised dependency parsing. In ACL-HLT.
Rada Mihalcea. 2004. Co-training and self-training for word sense disambiguation. In CoNLL.
Kenji Sagae and Jun'ichi Tsujii. 2007. Dependency parsing and domain adaptation with LR models and parser ensembles. In CoNLL Shared Task.
Kenji Sagae and Andrew Gordon. 2009. Clustering words by syntactic similarity improves dependency parsing of predicate-argument structures. In IWPT.
Drahomira Spoustova et al. 2009. Semi-supervised training for the averaged perceptron POS tagger. In EACL.
Jun Suzuki and Hideki Isozaki. 2008. Semi-supervised sequential labeling and segmentation using giga-word scale unlabeled data. In ACL-HLT.
Jun Suzuki et al. 2009. An empirical study of semi-supervised structured conditional models for dependency parsing. In EMNLP.
Anders Søgaard and Christian Rishøj. 2010. Semi-supervised dependency parsing using generalized tri-training.
Joseph Turian et al. 2010. Word representations: a simple and general method for semi-supervised learning. In ACL.
Shuicheng Yan and Huan Wang. 2009. Semi-supervised learning by sparse representation. In SIAM Data Mining.
