EDML 2019 : 1st Workshop on Evaluation and Experimental Design in Data Mining and Machine Learning @ SDM 2019

Link: http://www.cip.ifi.lmu.de/~schubert/EDML/

When	May 2, 2019 - May 4, 2019
Where	Calgary
Submission Deadline	Feb 15, 2019
Notification Due	Mar 15, 2019

Categories machine learning data mining data quality evaluation design

Call For Papers

*Description*

A vital part of proposing new machine learning and data mining approaches is evaluating them empirically to allow an assessment of their capabilities. Numerous choices go into setting up such experiments: how to choose the data, how to preprocess them (or not), potential problems associated with the selection of datasets, what other techniques to compare to (if any), what metrics to evaluate, etc. and last but not least how to present and interpret the results. Learning how to make those choices on-the-job, often by copying the evaluation protocols used in the existing literature, can easily lead to the development of problematic habits. Numerous, albeit scattered, publications have called attention to those questions and have occasionally called into question published results, or the usability of published methods. At a time of intense discussions about a reproducibility crisis in natural, social, and life sciences, and conferences such as SIGMOD, KDD, and ECML/PKDD encouraging researchers to make their work as reproducible as possible, we therefore feel that it is important to bring researchers together, and discuss those issues on a fundamental level.

A related issue concerns the first choice mentioned above: even the best-designed experiment carries only limited information if the underlying data are lacking. We therefore also want to discuss questions related to the availability of data, whether they are reliable and diverse, and whether they correspond to realistic and/or challenging problem settings.

*Topics*

In this workshop, we mainly solicit contributions that discuss those questions on a fundamental level, take stock of the state-of-the-art, offer theoretical arguments, or take well-argued positions, as well as actual evaluation papers that offer new insights, e.g. question published results, or shine the spotlight on the characteristics of existing benchmark data sets.

As such, topics include, but are not limited to:
- Benchmark datasets for data mining tasks: are they diverse/realistic/challenging?
- Impact of data quality (redundancy, errors, noise, bias, imbalance, ...) on qualitative evaluation
- Propagation/amplification of data quality issues on the data mining results (also interplay between data and algorithms)
- Evaluation of unsupervised data mining (dilemma between novelty and validity)
- Evaluation measures
- (Automatic) data quality evaluation tools: What are the aspects one should check before starting to apply algorithms to given data?
- Issues around runtime evaluation (algorithm vs. implementation, dependency on hardware, algorithm parameters, dataset characteristics)
- Design guidelines for crowd-sourced evaluations

The workshop will feature a mix of invited speakers, a number of accepted presentations, and a panel discussion. Presentations will leave ample time for questions, since those contributions will be less technical and more philosophical in nature. The panel will discuss the current state, the areas that most urgently need improvement, and recommendations to achieve those improvements. Workshop submissions will be published in the CEUR-WS workshop series. An important objective of this workshop is a document synthesizing these discussions that we intend to publish at a more prominent venue.

*Submission*

Papers should be submitted as PDF, using the SIAM conference proceedings style, available at https://www.siam.org/Portals/0/Publications/Proceedings/soda2e_061418.zip?ver=2018-06-15-102100-887. Submissions should be limited to nine pages and submitted via EasyChair at https://easychair.org/conferences/?conf=edml19.

*Important dates*

Submission deadline: February 15, 2019
Notification deadline: March 15, 2019
SDM pre-registration deadline: April 2, 2019
Camera ready: April 15, 2019
Conference dates: May 2-4, 2019

*Organizers*

Eirini Ntoutsi, Leibniz University Hannover & L3S Research Center, Germany, ntoutsi@kbs.uni-hannover.de
Erich Schubert, Technical University Dortmund, Germany, erich.schubert@cs.tu-dortmund.de
Arthur Zimek, University of Southern Denmark, zimek@imada.sdu.dk
Albrecht Zimmermann, University Caen Normandy, France, albrecht.zimmermann@unicaen.fr

The workshop's website can be found at https://imada.sdu.dk/Research/EDML/
