WePS 3 2010: Third WePS Evaluation Workshop: Searching Information about Entities in the Web
Link: http://nlp.uned.es/weps

Call For Papers

Previous WePS campaigns focused on the people search task. The first campaign addressed the name ambiguity problem, defining the task as clustering the web search results for a given person name, aiming at one cluster per person sharing the name. The second campaign used a refined version of the evaluation metrics and added an attribute extraction task for web documents returned by the search engine for a given person name.

In WePS-3 we aim to merge both problems into a single task, in which the system must return both the documents and the attributes for each of the different people sharing a given name. This is not a trivial step from the point of view of evaluation: a system may correctly extract attribute values from different URLs but then incorrectly merge them into person profiles.

In addition, WePS-3 adds a task that considers, for the first time, another relevant type of named entity: organizations. We will focus on name ambiguity for organizations, a highly relevant problem for Online Reputation Management systems. Take, for instance, the online company Amazon. In order to trace mentions and opinions about Amazon in web data (including news and blog feeds and input from social networks), the system must filter out alternative senses of “Amazon” (the South American river, the nation of female warriors, etc.). Such filtering cannot be done by liberally adding keywords to a query (e.g. “amazon online store”), because that may harm recall, and recall is crucial for reputation management.

* Task definitions

WePS 3 will be a competitive evaluation campaign including two tasks concerning the Web entity search problem:

** Task 1: Clustering and Attribute Extraction for Web People Search

Task 1 is related to Web People Search and focuses on person name ambiguity and person attribute extraction on Web pages.
Given a set of web search results for a person name, the task is to cluster the pages according to the different people sharing the name and to extract certain biographical attributes for each person (i.e., for each cluster of documents).

Guidelines for the WePS-3 Person Name Disambiguation Task (http://nlp.uned.es/weps/weps-3/guidelines/41-guidelines-for-the-weps-3-person-name-disambiguation-task)
Guidelines for the WePS-3 Attribute Extraction Subtask (http://nlp.uned.es/weps/weps-3/guidelines/42-guidelines-for-the-weps-3-attribute-extraction-subtask)

** Task 2: Name ambiguity resolution for Online Reputation Management (ORM)

Task 2 is related to Online Reputation Management (ORM) for organizations and focuses on the problem of ambiguity for organization names and the relevance of Web data for reputation management purposes. The motivation is to help experts in reputation management and alert services, for whom name ambiguity is currently an important bottleneck. Twitter has been chosen as the target data because it is a critical source for real-time reputation management and because ambiguity resolution is challenging: tweets are short and little context is available for resolving name ambiguity.

The task is defined as follows: given a set of Twitter entries containing an (ambiguous) company name, and given the home page of the company, the task is to identify the entries that do not refer to the company. Entries will be given in two languages: English and Spanish.

Guidelines for the WePS-3 On-line Reputation Management Task (http://nlp.uned.es/weps/weps-3/guidelines/40-guidelines-for-the-weps-3-on-line-reputation-management-task)

* Participation

A team can choose to participate in both Task 1 and Task 2 or in only one of them. In Task 1, Clustering is mandatory and Attribute Extraction is optional (i.e., teams that perform the Attribute Extraction subtask are required to complete the Clustering task too).
The organizers will provide annotated data for developing and training systems (read the task guidelines at http://nlp.uned.es/weps/weps-3/guidelines for more details). In a second stage, an unannotated corpus will be distributed, system output will be collected, and evaluation results will be returned to the participants. Each team can submit up to five runs for each task (Clustering, Attribute Extraction and ORM). Every team is expected to write a paper describing its system and discussing the evaluation results.

The results of the evaluation campaign will be discussed in a one-day workshop held as a CLEF 2010 Lab in Padua (Italy), 22 or 23 September 2010.

* How do I register?

Please send an email expressing your interest to the task organizers (weps-organizers@lsi.uned.es). State the name of your research group, a contact e-mail and the task(s) in which you intend to participate (Task 1 clustering only, Task 1 clustering + attribute extraction, Task 2).

* Important Dates

Release of trial data: 15 February 2010
Release of test data: 1 June 2010
Submissions due: 15 June 2010
Release of official results: 15 July 2010
Papers due: 15 August 2010
Workshop: 20, 21 or 22 September (CLEF 2010, Padua)

* Organizers

The general lab coordinators are:
Julio Gonzalo (UNED, Madrid), julio@lsi.uned.es
Satoshi Sekine (NYU, New York), sekine@cs.nyu.edu

The coordinators for Task 1 (people search) are:
Javier Artiles (UNED, Madrid), javart@bec.uned.es
Andrew Borthwick (Intelius Corp., Palo Alto), aborthwick@intelius.com

The coordinators for Task 2 (organizations search) are:
Bing Liu (University of Illinois at Chicago), liub@cs.uic.edu
Enrique Amigó (UNED, Madrid), enrique@lsi.uned.es
Adolfo Corujo (Llorente & Cuenca, Madrid), acorujo@llorenteycuenca.com

* Program Committee

Eneko Agirre, EHU, Spain
Breck Baldwin, Alias-i, USA
Danushka Bollegala, Tokyo University, Japan
Jeremy Ellman, Northumbria University, UK
Donna Harman, National Institute of Standards and Technology (NIST), USA
Eduard Hovy, ISI, USA
Dmitri Kalashnikov, University of California, USA
Paul Kalmar, USA
Bernardo Magnini, FBK-irst, Italy
Gideon Mann, Google, USA
Yutaka Matsuo, Tokyo University, Japan
Manabu Okumura, Tokyo Inst. of Tech., Japan
Ted Pedersen, University of Minnesota, USA
Massimo Poesio, University of Essex, UK
Maarten de Rijke, University of Amsterdam, Netherlands
Jamie Taylor, Freebase, USA
Mark Sanderson, University of Sheffield, UK
Arjen P. de Vries, Centrum Wiskunde & Informatica, Netherlands

Updated information about the task can be found at the WePS web site (http://nlp.uned.es/weps).