HPAI4S 2025 : HPC for AI Foundation Models & LLMs for Science (Co-located with IEEE IPDPS)
Link: https://sites.google.com/view/hpai4s/

Call For Papers
==========
OVERVIEW:
==========
Large Language Models (LLMs) have revolutionized the field of Natural Language Processing (NLP) by introducing powerful AI systems capable of understanding and generating human-like text with remarkable fluency and coherence. These models, trained on vast amounts of data, can perform a wide range of tasks, such as language translation, text summarization, and knowledge distillation, enabling researchers to navigate complex scientific literature more efficiently. LLMs are only a starting point in unlocking the generative abilities of broader transformers to accelerate science: analyzing and plotting experimental data, formulating hypotheses, designing experiments, and even predicting promising research directions. To this end, modern transformers combine multi-modal data, leverage domain-specific representations, capture correlations through complex attention mechanisms (e.g., self-attention, cross-attention), and compose specialized architectures (e.g., mixture of experts). They form the core of foundation models (FMs). The potential for architectural innovations has barely been tapped.

In a quest for more emergent behavior (advanced capabilities that are not explicitly trained for but emerge spontaneously due to the massive scale and exposure to vast amounts of data during training), the scale and complexity of FMs continuously increase, requiring larger training infrastructures and drastically escalating their energy footprint. In addition, the sharp increase in the popularity of these models and the complexity of their prompting generate extreme volumes of concurrent inferences that need to be delivered at high throughput. As a consequence, FM training and inference for science applications face two major challenges. The first is the democratization of FMs: the scale, cost, and time required to train FMs and run inferences are prohibitive for small and medium institutions, both in academia and industry. The second is the unprecedented scale, duration, and throughput of the parallel executions required for training and inference. This new context raises many open research problems and innovation opportunities for parallel computing.

This workshop will provide the scientific community with a dedicated forum for discussing new research, development, and deployment of FMs at scale. Specifically, it aims to address the high performance, scalability, and energy efficiency of FMs through a combination of system-level and algorithmic aspects, such as: processing and curating the training data; efficient parallelization techniques (data, tensor, and pipeline parallelism, multi-level memory management, redundancy elimination, etc.); effective data reduction approaches (for parameters, activations, optimizer states, and gradients); low-overhead checkpointing and strategies to survive loss spikes and other anomalies; fine-tuning and continual learning strategies; comprehensive evaluation and benchmarking; efficient batching, scheduling, and caching of inference requests to serve a large number of users concurrently; strategies for prompt engineering and augmentation (e.g., RAG); and applications to domain sciences.

=======
TOPICS:
=======
We seek contributions related to the following topics. Papers that intersect with two or more topics are particularly encouraged.

- Model Exploration: Model distillation, ablation, and compression methods to experiment with new model architectures and multi-modal data at scale, for path-finding or edge deployment.
- Data: Preparing science data for use in AI models, including but not limited to data reduction, sampling, filtering, curation, and deduplication.
- Pretraining: Efficient stall/failure detection and recovery mechanisms such as checkpoint/restart, novel parallelism approaches, and efficient data pipelines for pretraining.
- Alignment, Fine-tuning, and Continual Fine-tuning: Reinforcement learning from human feedback and mitigating catastrophic forgetting.
- Evaluation: Tools and techniques for scaling human and automated evaluation of AI for science using HPC and the cloud at scale.
- Inference: Multi-tenancy and KV cache management, model instance management, batching of queries, and balancing latency and throughput for inference.
- Multi-modality: Software/hardware techniques to combine text with multi-modal science data, especially from large-scale scientific simulations and instruments, to enable reasoning across domains.
- Reproducibility, Provenance, and Traceability: Tools and approaches that enable tracking and reproducibility of AI experiments.
- Systems software (e.g., compilers, schedulers, drivers, and core libraries), hardware (e.g., networks, accelerators, GPUs), and theory, modeling, and algorithms used in networking, computing, and storage for AI for science during inference, pretraining, fine-tuning, alignment, continual learning, and evaluation.
- Other examples of AI for HPC and HPC for AI in scientific applications at scale.

================
IMPORTANT DATES:
================
Paper submission deadline: February 6th, 2025 AoE
Initial notification: February 13th, 2025 AoE
Revised submission deadline: February 18th, 2025 AoE
Final notification: February 20th, 2025 AoE
Camera-ready papers: March 6th, 2025 AoE

=============
SUBMISSIONS:
=============
Authors are invited to submit papers describing unpublished, original research. The workshop accepts full 8-page papers and short/work-in-progress 5-page papers, including references, figures, and tables. All manuscripts should be formatted using the IEEE conference-style template (https://www.ieee.org/conferences/publishing/templates.html) with a 10-point font on 8.5x11-inch pages. All papers must be in English. We use a single-blind reviewing process, so please keep the authors' names, publications, etc., in the text. Papers will be peer-reviewed, and accepted papers will be published in the IPDPS workshop proceedings. All papers should be uploaded to https://ssl.linklings.net/conferences/ipdps/.