TSAR 2024 : The third workshop on Text Simplification, Accessibility and Readability

When	Nov 15, 2024 - Nov 16, 2024
Where	Miami, Florida, USA
Submission Deadline	Sep 6, 2024
Notification Due	Sep 20, 2024
Final Version Due	Oct 4, 2024

Categories	natural language processing, accessibility, artificial intelligence, large language models

Call For Papers

The third workshop on Text Simplification, Accessibility and Readability (TSAR 2024)

Workshop at EMNLP 2024

URL: https://tsar-workshop.github.io/
Google Group: https://groups.google.com/g/tsarworkshop

Organisers:

Fernando Alva-Manchego, Cardiff University
Horacio Saggion, Pompeu Fabra University
Kai North, George Mason University
Marcos Zampieri, George Mason University
Matthew Shardlow, Manchester Metropolitan University
Sanja Štajner, Independent Researcher
Sian Gooding, Google

The organisers are pleased to present the third instalment of the annual workshop on Text Simplification, Accessibility and Readability (TSAR 2024). TSAR aims to provide a cohesive environment that draws together members of the computational linguistics, natural language processing and artificial intelligence communities who work on automated techniques for making language accessible to all. Previous editions of the workshop were held at EMNLP in 2022 and RANLP in 2023, with significant engagement and participation driving research and building community in the fields of Text Simplification, Accessibility and Readability.

Research in automatic text simplification (TS) has often focused on proposing methods derived from the deep learning paradigm (Glavaš and Štajner, 2015; Paetzold and Specia, 2016; Nisioi et al., 2017; Zhang and Lapata, 2017; Martin et al., 2020; Maddela et al., 2021; Sheang and Saggion, 2021). Recently, work in text simplification has leveraged the new era of foundational large language models, using fine-tuning and prompt engineering to produce simplifications (Kew et al., 2023; Cripwell et al., 2023; Farajidizaji et al., 2024).

However, many important aspects of automatic TS still require the attention of our community. These include the design of appropriate evaluation metrics (Štajner et al., 2022), the development of context-aware simplification solutions (Yimam et al., 2018; Shardlow et al., 2022a; Saggion et al., 2022), the creation of appropriate language resources to support research and evaluation (Maddela and Xu, 2018; Shardlow et al., 2020; Ferrés and Saggion, 2022; North et al., 2022), the deployment of simplification in real environments for real users (Lee and Yeung, 2018; Alonzo et al., 2022), the study of discourse factors in text simplification (Zhong et al., 2019), and the identification of factors affecting the readability of a text (Shardlow et al., 2022b), among others. Addressing these issues requires collaboration among CL/NLP researchers, machine learning and deep learning researchers, UI/UX and accessibility professionals, and representatives of public organisations (Štajner, 2021).

Research on automatic text simplification, textual accessibility, and readability has the potential to improve social inclusion of marginalized populations. These related research areas have attracted attention in the past ten years, as evidenced by the growing number of publications in NLP conferences. While only about 300 articles in Google Scholar mentioned TS in 2010, this number increased to about 600 in 2015 and to more than 1000 in 2020 (Štajner, 2021). Interest has continued to grow since 2020: at the recent LREC-COLING 2024 conference, 50 new papers on related topics were presented across the main conference and associated workshops.

The TSAR 2024 workshop builds upon the recent success of several regional workshops that covered a subset of our topics of interest, including the READI workshops at LREC 2022 and LREC 2024, the SEPLN 2021 Workshop on Current Trends in Text Simplification (CTTS), and the SimpleText workshop at CLEF 2021, as well as the birds-of-a-feather events on Text Simplification at NAACL 2021 (over 50 participants), ACL 2022 and EMNLP 2023.

Topics of Interest

Submissions to the workshop will be organised into two tracks, a main track and a special track with a focus on evaluation.

The main track will invite contributions on the following topics, aligned with previous editions of the workshop:
- Lexical simplification;
- Syntactic simplification;
- Discourse simplification;
- Document simplification;
- Modular and end-to-end TS;
- Sequence-to-sequence and zero-shot TS;
- Controllable TS;
- Text complexity assessment;
- Complex word identification and lexical complexity prediction;
- Corpora, lexical resources, and benchmarks for TS;
- Domain-specific TS (e.g. health, legal);
- Assistive technologies for readability and comprehension beyond text;
- Other related readability and accessibility topics (e.g. empirical and eye-tracking studies).

The special track will focus on Evaluation of Text Simplification and Readability Systems. Papers in this track will be ranked and assessed separately from those in the main track. Topics include, but are not limited to:
- New evaluation measures and metrics for the assessment of text simplification, accessibility and/or readability;
- Reference-based metrics;
- Referenceless metrics;
- Metrics at varying levels of granularity (Document/Paragraph/Sentence/Word/Sub-word);
- Readability-based metrics;
- Alignment of new and existing metrics to human judgments;
- The role of LLMs in the evaluation of systems for text simplification, accessibility and/or readability.

Submissions
We welcome three types of papers: long papers, short papers and demos.
Long and Short Papers: We adhere to the same guidelines as EMNLP 2024.
Demo Papers: Demos should be no more than two (2) pages, including references, and should describe implemented systems related to the topics of interest of the workshop. Each demo paper should also include a link to a short screencast of the working software. In addition, authors of demo papers must be willing to present a demo of their system during TSAR 2024.

Dates

Paper Submissions: 6th September
Paper Notifications: 20th September
Camera Ready: 4th October

Programme Committee

Anna Dmitrieva, University of Helsinki
Arne Jonsson, Linköping University
Christina Niklaus, University of St. Gallen
Daniel Wiechmann, University of Amsterdam
Daniele Schicchi, Università di Palermo
David Kauchak, Pomona College
Dennis Aumiller, Cohere
Emad Alghamdi, King Abdulaziz University
Felice Dell'Orletta, Istituto di Linguistica Computazionale "Antonio Zampolli"
Freya Hewett, Humboldt Institute for Internet and Society
Giulia Venturi, Institute of Computational Linguistics "Antonio Zampolli" (ILC-CNR)
Itziar Gonzalez-Dios, HiTZ Basque Center for Language Technologies - Ixa, University of the Basque Country UPV/EHU
Jaap Kamps, University of Amsterdam
Jan Trienes, University of Duisburg-Essen
Jasper Degraeuwe, Ghent University
Jipeng Qiang, Yangzhou University
Joseph Imperial, University of Bath
Laura Vásquez-Rodríguez, University of Manchester
Liam Cripwell, LORIA
Liana Ermakova, HCTI EA-4249, Université de Bretagne Occidentale
Maja Popović, ADAPT, Dublin City University
Margot Madina, Darmstadt University of Applied Sciences
Michael Gille, Hamburg University of Applied Sciences
Michael Ryan, Georgia Institute of Technology
Mounica Maddela, Georgia Institute of Technology
Natalia Grabar, CNRS STL UMR8163, Université de Lille
Oliver Alonzo, Rochester Institute of Technology
Philippe Laban, Salesforce Research
Piotr Przybyła, Universitat Pompeu Fabra
Raquel Hervas, University Complutense of Madrid
Regina Stodden, Computational Linguistics Department, Heinrich Heine University Düsseldorf
Rémi Cardon, CENTAL, ILC, Université Catholique de Louvain
Sarah Ebling, University of Zurich
Silvana Deilen, University of Hildesheim
Sowmya Vajjala, National Research Council Canada
Susana Bautista, Universidad Francisco de Vitoria
Sweta Agrawal, University of Maryland
Tadashi Nomoto, National Institute of Japanese Literature
Tomas Goldsack, University of Sheffield
Victoria Yaneva, National Board of Medical Examiners
Yannick Parmentier, University of Lorraine

References

Oliver Alonzo, Sooyeon Lee, Mounica Maddela, Wei Xu, and Matt Huenerfauth. 2022. A dataset of word-complexity judgements from deaf and hard-of-hearing adults for text simplification. In Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022). Association for Computational Linguistics.

Liam Cripwell, Joël Legrand, and Claire Gardent. 2023. Document-Level Planning for Text Simplification. In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, pages 993–1006, Dubrovnik, Croatia. Association for Computational Linguistics.

Asma Farajidizaji, Vatsal Raina, and Mark Gales. 2024. Is It Possible to Modify Text to a Target Readability Level? An Initial Investigation Using Zero-Shot Large Language Models. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 9325–9339, Torino, Italia. ELRA and ICCL.

Daniel Ferrés and Horacio Saggion. 2022. ALEXSIS: A dataset for lexical simplification in Spanish. In Proceedings of LREC.

Goran Glavaš and Sanja Štajner. 2015. Simplifying Lexical Simplification: Do We Need Simplified Corpora? In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL, pages 63–68.

Tannon Kew, Alison Chi, Laura Vásquez-Rodríguez, Sweta Agrawal, Dennis Aumiller, Fernando Alva-Manchego, and Matthew Shardlow. 2023. BLESS: Benchmarking Large Language Models on Sentence Simplification. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 13291–13309, Singapore. Association for Computational Linguistics.

John Lee and Chak Yan Yeung. 2018. Personalizing lexical simplification. In Proceedings of the 27th International Conference on Computational Linguistics, Santa Fe, New Mexico, USA. Association for Computational Linguistics.

Mounica Maddela, Fernando Alva-Manchego, and Wei Xu. 2021. Controllable text simplification with explicit paraphrasing. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 3536–3553.

Mounica Maddela and Wei Xu. 2018. A word-complexity lexicon and a neural readability ranking model for lexical simplification. In Proceedings of EMNLP.

Louis Martin, Éric de la Clergerie, Benoît Sagot, and Antoine Bordes. 2020. Controllable sentence simplification. In Proceedings of the 12th Language Resources and Evaluation Conference, pages 4689–4698.

Sergiu Nisioi, Sanja Štajner, Simone Paolo Ponzetto, and Liviu P. Dinu. 2017. Exploring neural text simplification models. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), pages 85–91.

Kai North, Marcos Zampieri, and Tharindu Ranasinghe. 2022. ALEXSIS-PT: A new resource for Portuguese lexical simplification. In Proceedings of COLING.

Gustavo Henrique Paetzold and Lucia Specia. 2016. Unsupervised lexical simplification for non-native speakers. In Proceedings of the 30th AAAI.

Horacio Saggion, Sanja Štajner, Daniel Ferrés, Kim Cheng Sheang, Matthew Shardlow, Kai North, and Marcos Zampieri. 2022. Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification. In Proceedings of TSAR.

Matthew Shardlow, Michael Cooper, and Marcos Zampieri. 2020. CompLex — a new corpus for lexical complexity prediction from Likert Scale data. In Proceedings of READI.

Matthew Shardlow, Richard Evans, Gustavo Paetzold, and Marcos Zampieri. 2022a. SemEval-2021 Task 1: Lexical complexity prediction. In Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval-2021), Barcelona, Spain.

Matthew Shardlow, Richard Evans, and Marcos Zampieri. 2022b. Predicting Lexical Complexity in English Texts: The Complex 2.0 Dataset. Language Resources and Evaluation, 56(4):1153–1194.

Kim Cheng Sheang and Horacio Saggion. 2021. Controllable sentence simplification with a unified text-to-text transfer transformer. In Proceedings of the 14th International Conference on Natural Language Generation, pages 341–352, Aberdeen, Scotland, UK. Association for Computational Linguistics.

Sanja Štajner, Daniel Ferrés, Matthew Shardlow, Kai North, Marcos Zampieri, and Horacio Saggion. 2022. Lexical Simplification Benchmarks for English, Portuguese, and Spanish. Frontiers in Artificial Intelligence.

Sanja Štajner. 2021. Automatic text simplification for social good: Progress and challenges. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 2637–2652.

Seid Muhie Yimam, Chris Biemann, Shervin Malmasi, Gustavo Paetzold, Lucia Specia, Sanja Štajner, Anaïs Tack, and Marcos Zampieri. 2018. A Report on the Complex Word Identification Shared Task 2018. In Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications, New Orleans, United States. Association for Computational Linguistics.

Xingxing Zhang and Mirella Lapata. 2017. Sentence Simplification with Deep Reinforcement Learning. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 584–594.

Yang Zhong, Chao Jiang, Wei Xu, and Junyi Jessy Li. 2019. Discourse level factors for sentence deletion in text simplification. In Proceedings of AAAI.