
PaDAWan 2024 : 1st Portuguese Data Augmentation Workshop


Link: https://sites.google.com/view/padawan-2024/
 
When Nov 17, 2024 - Nov 21, 2024
Where Belém, Pará, Brazil
Submission Deadline Sep 10, 2024
Notification Due Oct 5, 2024
Final Version Due Oct 13, 2024
Categories    NLP   computational linguistics   artificial intelligence
 

Call For Papers

PaDAWan 2024: 1st Portuguese Data Augmentation Workshop (PaDAWan)

Belém, Pará, Brazil

Co-located with STIL 2024

November 17th to 21st, 2024

1st Call for Papers

https://sites.google.com/view/padawan-2024/

********************************************************

The Portuguese Data Augmentation Workshop (PaDAWan) aims to bring together the community working on data augmentation for Portuguese, particularly with Large Language Models (LLMs).

With the advancement of LLMs, many traditional Natural Language Processing (NLP) tasks are being revisited. One traditional key challenge is gathering high-quality data for training and evaluating specific tasks. This has often been the main bottleneck in developing machine learning models. Data augmentation has become a crucial technique for enhancing the performance of these models across various tasks, especially when reliable data are limited. Nowadays, particularly with the use of LLMs, it has become feasible to apply sophisticated text data augmentation techniques effectively.

The direct use of LLMs is still quite restricted by several factors, such as cost, privacy concerns, and latency. Given this scenario, using LLMs to generate synthetic data for training classical models on specific tasks is a viable approach. Moreover, while many industry efforts rely on synthetic data, scientific discussions of methods and evaluation are not always aligned with market needs.

This workshop aims to delve into the use of LLMs for data augmentation, exploring possible methods, evaluation techniques, and associated ethical considerations. The goal is to bring together industry professionals and academics for an in-depth discussion of the topic.

We invite researchers to submit papers that discuss challenges and advances in Portuguese data generation, including but not limited to the following topics:



- Data creation and data labeling
- Data reformation and anonymization
- Data contamination and noise
- Co-annotation
- Augmented data evaluation and controlled data augmentation
- Ethics in generated data and unbiased data generation
- Practical applications or case studies of data augmentation techniques
- Challenges in Portuguese Synthetic/Augmented Data

*Submissions*

We invite both unpublished work, to be published in a special section of the STIL Proceedings, and lightning talk proposals highlighting already published work.

Submission deadline: September 10, 2024

Notification for authors: October 5, 2024

Camera-ready versions due: October 13, 2024

For more information, please access:
https://sites.google.com/view/padawan-2024/

For any questions, please write to padawan.workshop@gmail.com
