
TWEETMT 2015 : TweetMT translation task


Link: http://komunitatea.elhuyar.org/tweetmt/
 
When Sep 15, 2015 - Sep 15, 2015
Where Alicante, Spain
Submission Deadline May 21, 2015
Notification Due Jun 12, 2015
Final Version Due Jul 3, 2015
Categories    NLP   text mining
 

Call For Papers

TWEETMT 2015: SECOND CALL

We would like to share some updates on the TweetMT translation
task. The development sets are now available on the web page, and some public
data has been included. The workshop date is now confirmed as
September 15. We would also like to remind you that registration remains
open until May 12.


Important dates
----------------------------------
March 1: Registration opened
April 21: Release of the development set
May 12: Registration deadline
May 19: Release of the test set
May 21: Result submission deadline
May 22-June 12: Manual evaluation. Publication of results
July 3: Short paper submission deadline
July 31: Camera-ready version of papers
September 15: Workshop

You can find more information on the website
http://komunitatea.elhuyar.org/tweetmt/


--TWEET TRANSLATION WORKSHOP AT SEPLN 2015

TweetMT is a workshop and shared task on machine translation applied to
tweets. It will take place in September 2015 in Alicante, co-located with
SEPLN 2015. The objective of the task is to bring together interested
researchers to experiment with and compare different
approaches to tweet MT. This workshop is a follow-up to two workshops
previously organized at SEPLN: TweetNorm2013 and TweetLID2014.

The machine translation of tweets is a complex task that greatly depends on
the type of data we work with. Translating tweets is very
different from translating well-formed texts posted, for instance, through a
content manager. Tweets are often written on mobile devices, which exacerbates
poor spelling, and they include errors, symbols and
diacritics. The texts also vary in structure and
include tweet-specific features such as hashtags, user mentions, and
retweets, among others. The translation of tweets can be tackled as a direct
translation (tweet-to-tweet) or as an indirect translation (tweet
normalization to standard text (Kaufmann & Kalita, 2011), text translation
and, if needed, tweet generation). Although the first approach looks
attractive, the lack of parallel or comparable tweets for the working
languages (Petrovic et al., 2010) tends to lead us towards an indirect
approach. Some authors also try to gather similar tweets in other languages
through cross-language information retrieval (CLIR).
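
As a rough illustration of the indirect approach described above, the
following Python sketch chains the three stages: normalization, translation
and tweet generation. It is only a sketch under stated assumptions: the
lookup table, the placeholder scheme and all function names are hypothetical,
and the translate argument stands in for whatever MT system a participant
plugs in. It is not the method of any cited work.

```python
import re

# Hypothetical lookup table of informal spellings (an assumption for this
# sketch); a real normalizer would be learned or far more extensive.
NORMALIZATION_TABLE = {"u": "you", "gr8": "great", "2moro": "tomorrow"}

# Hashtags, user mentions and URLs are treated as tweet-specific tokens.
TOKEN_RE = re.compile(r"(?:[#@]\w+|https?://\S+)")


def protect_tokens(tweet):
    """Swap tweet-specific tokens for placeholders so they survive translation."""
    protected = []

    def stash(match):
        protected.append(match.group(0))
        return f"__TOK{len(protected) - 1}__"

    return TOKEN_RE.sub(stash, tweet), protected


def normalize(text):
    """Map informal spellings to standard forms using the lookup table."""
    return " ".join(NORMALIZATION_TABLE.get(w.lower(), w) for w in text.split())


def restore_tokens(text, protected):
    """Tweet generation step: put the original tokens back in place."""
    for i, token in enumerate(protected):
        text = text.replace(f"__TOK{i}__", token)
    return text


def translate_tweet(tweet, translate):
    """Indirect translation: normalize, translate, then regenerate the tweet."""
    masked, protected = protect_tokens(tweet)
    translated = translate(normalize(masked))
    return restore_tokens(translated, protected)
```

Here translate can be any callable that maps a standard-text string to its
translation; passing an identity function makes it easy to inspect the
normalization and token restoration on their own.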

Work in this area is scarce in the literature but a growing interest is
evident (Gotti et al., 2013). An important point of reference is the work
done to translate SMS texts during the Haiti earthquake (Munro, 2010).

The current task will focus on MT of tweets between languages of the Iberian
Peninsula (Basque, Catalan, Galician, Portuguese and Spanish), as well as
English. The organizing committee has released development data including
parallel tweets that will enable participants to train their systems. For
the final evaluation participants will have to submit the automatic
translation of a number of tweet corpora in a short period of time. The
evaluation will be carried out using automatic distances to the reference
corpora.
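
The call does not name the automatic measures that will be used. Purely as
an illustration of what a distance to a reference can look like, the Python
sketch below computes a word-level edit distance and a corpus-level error
rate (edits divided by reference words). The function names are hypothetical,
and this stand-in should not be read as the official evaluation metric.

```python
def word_edit_distance(hypothesis, reference):
    """Word-level Levenshtein distance between a translation and its reference."""
    hyp, ref = hypothesis.split(), reference.split()
    # prev[j] holds the distance between the hypothesis words processed so far
    # and the first j reference words.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # drop a hypothesis word
                            curr[j - 1] + 1,      # add a missing reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def corpus_error_rate(hypotheses, references):
    """Total word edits divided by total reference words over a corpus."""
    total_edits = sum(word_edit_distance(h, r)
                      for h, r in zip(hypotheses, references))
    total_words = sum(len(r.split()) for r in references)
    return total_edits / total_words if total_words else 0.0
```

Lower values mean the system output is closer to the reference translations;
a score of 0.0 means the outputs match the references word for word.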

These corpora are not meant to be representative of all types of messages
that can be observed in informal communication. This is instead an initial
attempt at tackling part of the task which starts by addressing one of its
simplest parts. We are planning to use more informal and varied corpora in
future tasks as we make progress on these initial issues.

The workshop aims to be a forum where researchers will have a chance to
compare their methods, systems and results.


Organizing committee
-----------------------------------------
Iñaki Alegria (UPV/EHU)
Nora Aranberri (UPV/EHU)
Cristina España-Bonet (UPC)
Pablo Gamallo (USC)
Eva Martínez (UPC)
Hugo Oliveira (Universidade de Coimbra)
Iñaki San Vicente (Elhuyar)
Antonio Toral (DCU, Dublin)
Arkaitz Zubiaga (University of Warwick)


Proceedings
--------------------------
The workshop papers will be published in the proceedings of the "XXXI
Congreso de la Sociedad Española de Procesamiento de lenguaje natural". The
proceedings will also be published in the ceur-ws.org repository and
will be indexed by DBLP, among others.
