CfP: PEVA Special Issue 2024: Performance Analysis and Evaluation of Systems for Artificial Intelligence
Link: https://www.sciencedirect.com/journal/performance-evaluation/about/call-for-papers#performance-analysis-and-evaluation-of-systems-for-artificial-intelligence
Call For Papers
Artificial Intelligence and Machine Learning (AI/ML) applications are widely employed today in almost all sectors of society, ranging from everyday online applications (e.g., video streaming, e-commerce) to emerging tools for enterprises (e.g., large language models, inference serving platforms). The AI/ML models that enable these applications are quickly growing in size, featuring billions of parameters. Consequently, the computer systems used to train, run, and serve predictions from these models, henceforth referred to as Systems for AI/ML, have high performance requirements and are expensive to procure and operate, both in terms of monetary cost and environmental impact. For example, training a large AI/ML model requires computer system resources (servers, GPUs, TPUs) that consume megawatt-hours of electricity and emit tons of greenhouse gases.
There is thus an urgent need to study and optimize the performance of Systems for AI/ML, to characterize resource management and techniques for training and testing along with their implications for the accuracy requirements of AI/ML models, and to develop methods to effectively size and deploy these systems in production environments (e.g., at the edge). The performance analysis and evaluation of the resources and algorithms used to train, test, and operate an AI/ML system can provide a deeper understanding of the behavior and operation of such systems, allowing researchers to develop solutions for optimizing resource efficiency.
This special issue solicits unpublished works on performance analysis and evaluation research on the timely topic of Systems for AI/ML. The special issue will highlight novel approaches to the analysis, modeling, and evaluation of Systems for AI/ML, as well as specific applications and emerging architectures designed for Systems for AI/ML. It is intended for researchers, engineers, and practitioners who study and work on Systems for AI/ML, as well as those interested in performance analysis and modeling in general.
We welcome submissions that study performance and resource management in Systems for AI/ML or that present novel algorithms, techniques, or solutions to improve the efficiency and sustainability of Systems for AI/ML. While both theoretical and experimental approaches are welcome, the review process will pay particular attention to rigor and quantitative analysis. Works solely focused on improving the classification or regression performance of AI/ML models (e.g., in terms of metrics such as accuracy, recall, F1 score) are outside the scope of this special issue. Papers are expected to demonstrate advances in performance analysis and/or resource management for the Systems for AI/ML underpinning the training, testing, and operation of these models.
Topics of interest for this special issue include, but are not limited to, the following:

Techniques employed:
- Performance modeling, including probabilistic techniques, queueing models, and simulation
- Learning-based methods, including reinforcement learning and deep learning
- Performance troubleshooting using natural language processing or causal inference

Application domains:
- AI/ML training infrastructure, such as accelerator-equipped (e.g., GPU, TPU) clusters
- Edge devices and clusters for AI/ML deployment
- Inference serving platforms
- Recommendation systems
- Video analytics services
- Embedded systems (e.g., TinyML, wearable devices, sensors)
- ML frameworks, such as PyTorch and TensorFlow

Specific examples of submissions include, but are not limited to:
- Performance modeling of Systems for AI/ML
- Resource management and scheduling for AI/ML systems and services
- AI/ML system design in the presence of limited hardware resources
- Performance evaluation of the networking issues common in distributed AI/ML systems
- Root-cause analysis for performance diagnostics in Systems for AI/ML
- Methods for resource sizing and provisioning in tiny embedded systems for AI/ML (e.g., TinyML, smart wearable devices)
- Network and measurement approaches to characterize resource consumption of Systems for AI/ML
- Datasets and benchmarks for performance evaluation of Systems for AI/ML
- Sustainable Systems for AI/ML (focusing on consumption of power/energy/carbon/water or other environment-related metrics)