SLiDE 2026: Workshop on Structured Linguistic Data and Evaluation

Call For Papers

[apologies for cross-postings]

Workshop on Structured Linguistic Data and Evaluation (SLiDE)
A full-day workshop at LREC 2026, 11-16 May 2026, Palma, Mallorca (Spain)

In the last ten years, significant advances in deep learning models and the development of Large Language Models (LLMs) have revolutionized the fields of computational linguistics (CL) and natural language processing (NLP). In turn, this has led to a complete re-assessment of the language resources and evaluation practices necessary for training LLMs and analyzing their outputs. In particular, the availability of very large amounts of unstructured data for training foundational models has come into focus, while the value of high-quality structured linguistic data with rich annotations at various levels of linguistic analysis has been downplayed by comparison. However, as CL and NLP practitioners engage further with LLMs and debate their strengths and weaknesses, the importance of high-quality, structured linguistic data has been re-emphasized.

The proposed workshop can be seen as related to the Treebanks and Linguistic Theories (TLT) conference series and the more recent SyntaxFest venue. Over the years, these venues have provided a central forum for high-quality research on treebanks, syntactic theory, syntax-semantics interface, structured meaning representations, and annotated linguistic resources. With record participation in recent years, they demonstrate the vitality and relevance of this line of work.

The Workshop on Structured Linguistic Data is conceived as both a continuation of this tradition and an adaptation to the new realities of an LLM-dominated research landscape. The workshop will bring together researchers from these overlapping traditions to advance methods, resources, and practices for integrating structured linguistic data into the LLM era.

Topics of interest include but are not limited to:

Linguistic Data Analyses, Language Resources, and Evaluation
- Grammar processing with NLP and LLM-based tools
- Phonological and morphological analysis and LLM tokenization
- Annotation strategies with LLM-empowered methodologies and tools
- Design principles and annotation schemes for structured linguistic data
- Multi-lingual and cross-lingual settings
- Mapping of structured linguistic data to Linked Open Data resources
- Evaluation informed by language typology
- Language resources for under-resourced and endangered languages
- The use of structured linguistic data for NLP applications
- The use of structured linguistic data in acquiring linguistic knowledge
- (Semi-)automatic methods for creating structured linguistic data

Spoken Language Data
- Speech-to-text applications
- Speech generation techniques
- Speech data preparation, curation and evaluation

Multimodality and Situated Dialogue
- Structured multimodal resources: gesture AMR (GAMR), gaze and posture annotation, multimodal dialogue corpora
- Multimodal grounding: linking language with visual, gestural, and action representations
- Structured representations for co-attention and alignment in multiparty dialogue
- Multimodal evaluation resources for LLMs

Pragmatics and Discourse
- Structured data for discourse and dialogue: discourse relation annotation, coherence structures, dialogue acts
- Pragmatic annotation (speech acts, presupposition, implicature, politeness, stance)
- Structured approaches to common ground tracking and Theory of Mind in LLMs

Semantics and Lexical Meaning
- Dependency analysis and semantic parsing
- Annotation beyond syntax: semantics, pragmatics and discourse
- Structured data for lexical semantics: sense inventories, semantic frames, qualia structure, and type-theoretic resources
- Computational semantics resources: Abstract Meaning Representation (AMR), Universal Meaning Representation (UMR), Discourse Representation Structures, Minimal Recursion Semantics (MRS), Type Theory with Records (TTR)
- Distributional and neural-symbolic representations of lexical meaning (e.g., Holographic Reduced Representations (HRR), hyperdimensional computing) for structured LLM grounding
- Aligning vector-based meaning representations with symbolic/typed structures

We invite paper submissions in two distinct tracks:
- regular papers on substantial and original research, including empirical evaluation results where appropriate (6 to 8 pages, excluding references and potential ethics statements);
- short papers on smaller, focused contributions, work in progress, negative results, surveys, or opinion pieces (4 to 6 pages, excluding references and potential ethics statements).

Invited Speakers
- Naiara Perez (University of the Basque Country)
- Shira Wein (Amherst College)

Paper Submission and Templates
Submission follows the LREC 2026 conference instructions, using the START conference management system. The submission link will be provided as soon as it becomes available.
Submissions should follow the LREC stylesheet, available on the conference website on the Author’s kit page. Papers must be anonymized to support double-blind reviewing.

Important Dates
- February 22, 2026: Paper submission deadline
- March 15, 2026: Notification of acceptance
- March 25, 2026: Camera-ready papers
- May 2026: Workshop at LREC 2026
All deadlines are 11.59 pm UTC -12h (“anywhere on Earth”).

Workshop Organizers
- Jan Hajič (Charles University, Czech Republic)
- Erhard Hinrichs (Tübingen University, Germany)
- Sandra Kübler (Indiana University, USA)
- Joakim Nivre (Uppsala University, Sweden)
- Petya Osenova (Sofia University and IICT-BAS, Bulgaria)
- James Pustejovsky (Brandeis University, USA)