KDD DSHealth Workshop 2023 : KDD DSHealth 2023: Workshop on Applied Data Science for Healthcare

Link: https://dshealthkdd.github.io/dshealth-2023/

When	Aug 6, 2023 - Aug 10, 2023
Where	Long Beach, CA
Submission Deadline	Jun 15, 2023
Notification Due	Jun 23, 2023

Categories: artificial intelligence, machine learning, data science, healthcare

Call For Papers

KDD DSHealth 2023: Workshop on Applied Data Science for Healthcare
Applications and New Frontiers of Generative Models for Healthcare

# Call For Papers #

Generative models have a long history and many application areas in medical machine learning (ML) and artificial intelligence (AI). With the development of deep neural networks, researchers have focused in past years on Generative Adversarial Networks (GANs), Variational Autoencoders (VAEs), and autoregressive models. More recently, very large deep generative models have gained popularity, including large language models (LLMs) such as Generative Pre-trained Transformer 3 (GPT-3) and text-to-image diffusion models such as Stable Diffusion.

In healthcare research, one of the most common applications of generative models has been the generation of synthetic data for training machine learning models. Synthetic data is often used to increase the representation of patient subgroups in order to improve generalization and mitigate algorithmic biases. This is especially valuable in application domains where data is hard to come by. Generative models can also be used for specific model evaluation purposes (e.g., within a robustness or generalizability assessment, or in virtual clinical trials). They can help generate synthetic ground truth data when labeling data is extremely burdensome. Moreover, generative models have been successfully applied in data preprocessing and enhancement, such as deep learning algorithms for image reconstruction or denoising in medical imaging.

While such generative models have proven their utility in the health domain, many open questions remain regarding how to evaluate their effectiveness and safety. Testing and evaluation of such models require specific considerations. Take, for example, the assessment of the gap between generated data and reality (the so-called Sim2Real challenge): it is often unclear how to (i) quantify this domain gap and its impact on downstream performance in a meaningful manner and (ii) reduce it in order to fully leverage the potential of generative models.

New challenges are also emerging on a grander scale. Recent advances in LLMs make the generation of data even more effortless. However, misinformation generated with such models may cause a “pollution” of data for future model training. We can expect an increased need for effective fact-checking approaches. Despite the huge growth of this area of research, the actual use of NLP technology for fact checking is still in its infancy.

In this half-day workshop, we would like to discuss some of the most common applications of generative models in ML/AI research in the healthcare domain, examine the current challenges, and explore potential new areas of application.

## Submission Guidelines ##

We invite full papers as well as work-in-progress papers on the application of data science in healthcare. Topics include, but are not limited to, those in the List of Topics below (for more information, see the workshop webpage).

Papers must be submitted in PDF format via EasyChair (https://easychair.org/conferences/?conf=dshealth2023) and formatted according to the new Standard ACM Conference Proceedings Template. Authors are encouraged to use the Overleaf template (https://www.overleaf.com/latex/templates/acm-conference-proceedings-primary-article-template/wbvnghjbzwpc).

Papers must not exceed 4 pages, excluding references.

The program committee will select papers for spotlight and/or poster presentation based on originality, presentation, and technical quality.

## List of Topics ##

* Synthetic data
- Training data augmentation, e.g., for computer vision and medical imaging algorithms
- Physics- and chemistry-based generative models
- Simulated data and privacy preserving algorithms
- In-silico clinical trials
- Testing data, e.g. synthetic ground truth
- Generative AI for tabular data
- Interpretability
* Privacy and security of generative AI
- Inverse models for source verification
- Watermarking of AI-generated data
- Factual capabilities of generative AI
* Testing and evaluation of the generative models
- Sim2Real domain gap
- Data selection & quality aspects of the data (distribution shifts, monitoring of the models)
- Fact-checking
- Generating new healthcare-specific benchmarks
- Bias detection and mitigation in healthcare
- Reliability and trustworthiness of the generative models (actionable plans)
* Application of LLMs
- Systematic literature review
- Modernizing pharmaceutical call center operations
- Chatbot for patient registration, triage, scheduling, and rooming
- Semantic data augmentation
- Others
* Responsible use of Generative AI
- Generative AI fairness and bias detection
- Generative AI bias mitigation (e.g., adversarial training)
- Generative AI model transparency
- Generative AI ethics and responsible AI risk management
* Other
- Knowledge representation learning

## Organizing committee ##

Fei Wang, Cornell University, USA
Prithwish Chakraborty, IBM Research, USA
Tao Xu, F. Hoffmann-La Roche, Switzerland
Pei-Yun Sabrina Hsueh, Bayesian Health Inc., USA
Gregor Stiglic, University of Maribor, Slovenia
Jiang Bian, University of Florida, USA
Lixia Yao, Merck, USA
Alexej Gossmann, FDA, USA
Florian Buettner, Frankfurt University/German Cancer Research Center (DKFZ), Germany

## Venue ##

The workshop will be held as a half-day event at KDD 2023 in Long Beach, CA, on August 6, 2023.

Webpage: https://dshealthkdd.github.io/dshealth-2023/