
GermEval Task 1 2019 : Shared Task on hierarchical classification of German Blurbs


Link: https://competitions.codalab.org/competitions/21226
 
When Feb 1, 2019 - Aug 31, 2019
Where Erlangen, Germany
Submission Deadline Jul 15, 2019
Final Version Due Jul 31, 2019
Categories    hierarchical classification   multi-label classification   book blurbs   german texts
 

Call For Papers

GermEval 2019 Task 1 - Shared Task on hierarchical classification of German blurbs (short texts)

*Call for Participation*
We invite interested parties from academia and industry to participate in this shared task. Further information can be found here: https://competitions.codalab.org/competitions/21226.

Hierarchical multi-label classification (HMC) of blurbs is the task of assigning multiple labels to short descriptive texts of books, where each label is part of an underlying hierarchy of categories. The growing number of available digital documents and the need for more and finer-grained categories call for new, more robust and sophisticated text classification methods. Large datasets often incorporate a category hierarchy that can be used to organize documents at different levels of specificity. Traditional multi-class text classification approaches are thoroughly researched, but they fail to generalize adequately as the amount of available data grows and more specific hierarchies become necessary.

With this task we aim to foster research within the HMC context. The task focuses on classifying German books into their respective hierarchically structured categories using short advertisement texts (blurbs). The data contains additional metadata such as author, number of pages, release date, etc.


*Tasks*
This shared task consists of two subtasks, described below. Participants are free to take part in either one or both.

- *Subtask A*: The task is to classify German books into *one or multiple most general categories*. It can thus be considered a non-hierarchical multi-label classification task. Eight classes can be assigned in total: 'Literatur & Unterhaltung', 'Ratgeber', 'Kinderbuch & Jugendbuch', 'Sachbuch', 'Ganzheitliches Bewusstsein', 'Glaube & Ethik', 'Künste', 'Architektur & Garten'.

- *Subtask B*: The second task targets hierarchical multi-label classification, where the full hierarchy of labels should be assigned to a book. In addition to the most general category (Subtask A), further categories of varying specificity can be assigned to a book. In total, 343 different classes can be assigned in a hierarchical structure of at most 4 levels.
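
To illustrate how the two subtasks relate, here is a minimal Python sketch. It is not the organizers' data format or evaluation code. The child categories and the `PARENT` mapping are hypothetical and chosen only for illustration; only the three top-level names are taken from the list above, and they are a subset of the full set of classes. The sketch expands a set of specific labels to include all their ancestors (the Subtask B view) and then reduces them to a multi-hot vector over top-level classes (the Subtask A view).

```python
# A subset of the top-level classes named in the call (not the full list).
TOP_LEVEL = ["Literatur & Unterhaltung", "Ratgeber", "Sachbuch"]

# Hypothetical child -> parent mapping, for illustration only.
# These sub-categories are NOT taken from the task data.
PARENT = {
    "Krimi & Thriller": "Literatur & Unterhaltung",
    "Psychothriller": "Krimi & Thriller",
    "Gesundheit & Ernährung": "Ratgeber",
}


def ancestors(label, parent):
    """Return all ancestors of a label, from nearest to most general."""
    chain = []
    while label in parent:
        label = parent[label]
        chain.append(label)
    return chain


def expand(labels, parent):
    """Add every ancestor so the label set is closed under the hierarchy."""
    full = set(labels)
    for label in labels:
        full.update(ancestors(label, parent))
    return full


def top_level_of(labels, parent):
    """Keep only the most general categories (labels without a parent)."""
    return {label for label in expand(labels, parent) if label not in parent}


def multi_hot(labels, classes):
    """Encode a label set as a 0/1 vector in the order given by `classes`."""
    return [1 if c in labels else 0 for c in classes]


book_labels = {"Psychothriller", "Gesundheit & Ernährung"}
hierarchical = expand(book_labels, PARENT)          # Subtask B style label set
flat = multi_hot(top_level_of(book_labels, PARENT), TOP_LEVEL)  # Subtask A style
print(sorted(hierarchical))
print(flat)
```

In this sketch a book may carry labels from several branches at once, which is why Subtask A is multi-label rather than multi-class.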


*Data*
The entire dataset consists of 20,784 examples. Sample data is provided so that participants can familiarize themselves with the structure of the data. 14,548 training samples have been released and can be downloaded after registering for the shared task. A validation set (2,079 samples) has been published with its gold labels held back. Submissions for the validation set are accepted via the CodaLab page and published on a leaderboard until June 1st. On June 1st, we will start the final evaluation phase of the task by releasing the gold labels of the validation set, which can then be used as additional training data. At the same time, the test set samples will be provided, for which we accept submissions until July 15th. More information can be found on the task's webpage: https://competitions.codalab.org/competitions/21226
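
The figures above imply a test set size that the call does not state explicitly. The following short Python sketch derives it, assuming the training, validation and test sets do not overlap and together make up the entire dataset.

```python
# Figures stated in the call.
TOTAL = 20_784
TRAIN = 14_548
VALIDATION = 2_079

# Assumption: the three sets are disjoint, so the test set is the remainder.
TEST = TOTAL - TRAIN - VALIDATION

for name, size in [("train", TRAIN), ("validation", VALIDATION), ("test", TEST)]:
    # Print each split with its size and its share of the whole dataset.
    print(f"{name:<10} {size:>6}  {size / TOTAL:.1%}")
```

Under that assumption the test set would contain 4,157 samples, which corresponds to a split of roughly 70/10/20.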


*Important Dates*
- January 2019: Release of trial data
- February 01, 2019: Release of training data (train + validation)
- June 01, 2019: Release of gold labels for validation set + test data
- July 15, 2019: Final deadline for submissions of test results
- July 31, 2019: Submission of description papers
- August 20, 2019: Notification of acceptance
- September 15, 2019: Camera-ready deadline for system description papers
- October 08, 2019: Workshop in Erlangen, Germany

The shared task will be accompanied by a pre-conference workshop of the Conference on Natural Language Processing ("Konferenz zur Verarbeitung natürlicher Sprache", KONVENS) hosted on October 8, 2019 at FAU Erlangen-Nuremberg (http://2019.konvens.org/).


*Workshop Proceedings*
Description papers will appear in online workshop proceedings. Participants who submit a description paper will be asked to register for the workshop and present their system as a poster or in an oral presentation (depending on the number of submissions).


*Organizers*
The task is organized by Rami Aly, Steffen Remus and Chris Biemann, Language Technology, Department of Informatics, Universität Hamburg, https://lt.informatik.uni-hamburg.de


*GermEval*
GermEval is a series of shared task evaluation campaigns that focus on Natural Language Processing for the German language. GermEval has been conducted four times since 2014 in co-location with KONVENS/GSCL conferences. For an overview of the currently conducted tasks, please see http://2019.konvens.org/germeval. We highly encourage readers to also take note of task 2 (Identification of offensive language, https://projects.fzai.h-da.de/iggsa/) and task 3 (Lemmatization of German Web and Social Media Texts, https://fau-klue.github.io/empirist-lemmatization/).
