Workshop on Reproducible, Customizable and Portable Workflows for HPC
Abstract deadline:
Full paper deadline: 2018-08-24
Conference date: 2018-11-11
Conference difficulty:
CCF category: none
Venue: Dallas, TX, USA
Overview
Reproducibility is critical for the scientific process. Sharing the artifacts, data, and workflows associated with papers forces authors to turn a more careful eye to their own work, and it enables other scientists to easily validate and build on prior work. Over the past five years, many top-tier parallel computing conferences have established Artifact Evaluation (AE) initiatives. The community has bought in, and nearly half of accepted papers now include artifacts.
Unfortunately, recent attempts to reproduce experimental results show that many challenges remain (AE slides, ACM ReQuEST report). A lack of common tools, along with increasingly deep stacks of dependencies, leads to ad-hoc workflows, and evaluators struggle to install, run, and analyze experiments. These challenges are not unique to artifact evaluation. Users of production simulation codes struggle to reproduce complex workflows, even on the same machine. Benchmark suites are notoriously complex to configure and work with, and reproducing their performance can be a daunting task. Indeed, nearly all shared artifacts still require manual steps and human intuition, which ultimately makes them difficult to customize, port, reuse, and build upon.
ResCuE-HPC will bring together HPC researchers and practitioners to propose and discuss ways to enable reproducible, portable, and customizable experimental workflows for HPC. We are interested in contributions that describe the state of the art and the pitfalls of reproducibility, as well as improvements to existing frameworks, benchmarks, and datasets that can be used to run HPC workloads across multiple software versions and hardware architectures. Ultimately, we aim to automate artifact evaluation, benchmarking, and workflows with a common co-design framework, and to collaboratively solve reproducibility issues in HPC.
We invite position papers of up to 4 pages presenting novel or existing practical solutions to:
-automate and unify artifact evaluation at HPC conferences;
-share artifacts (workloads, benchmarks, data sets, models, tools), workflows and experiments in a portable, customizable, and reusable format;
-automatically and natively install and rebuild all software dependencies required for shared experimental workflows on different machines and environments;
-automatically report and visualize experimental results, including interactive articles, to support reproducibility initiatives at SC and other conferences and journals;
-continuously validate experiments from past research and report/record unexpected behavior (bugs, numerical instability, variation of empirical results such as execution time or energy measurements, etc.) on new and evolving software and hardware stacks;
-establish open repositories of common benchmarks, data sets and tools to accelerate knowledge exchange between HPC centers;
-enable universal, customizable, and multi-objective auto-tuning and co-design of HPC software and hardware in a reproducible and reusable way;
-unify statistical analysis and predictive modeling techniques to improve the reproducibility of empirical results (see the sketch after this list).
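To make the last two topics concrete, the following minimal Python sketch shows one kind of statistical check a continuous-validation tool might apply: it compares repeated execution-time measurements from two software stacks and flags a change only when it exceeds run-to-run noise. The sample timings and the normal-approximation confidence interval are illustrative assumptions, not a prescribed method.

    import statistics

    def mean_ci(samples, z=1.96):
        """Return (mean, half-width of an approximate 95% confidence interval)."""
        m = statistics.mean(samples)
        s = statistics.stdev(samples)
        return m, z * s / len(samples) ** 0.5

    # Hypothetical execution times (seconds) measured on two software stacks.
    baseline = [10.2, 10.4, 10.1, 10.6, 10.3, 10.2, 10.5, 10.4]
    candidate = [11.0, 11.3, 10.9, 11.2, 11.1, 11.4, 11.0, 11.2]

    m0, h0 = mean_ci(baseline)
    m1, h1 = mean_ci(candidate)

    # Report a change only if the confidence intervals do not overlap,
    # so that normal run-to-run variation is not mistaken for a regression.
    if abs(m1 - m0) > h0 + h1:
        print(f"behavior changed: {m0:.2f}s -> {m1:.2f}s (beyond measurement noise)")
    else:
        print("difference is within measurement noise")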
We also encourage submissions demonstrating practical use cases of portable, customizable, and reusable HPC workflows that connect existing tools, including but not limited to Spack, Collective Knowledge, EasyBuild, and the Common Workflow Language.
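As a minimal sketch of such a connection, assuming Spack is installed and on PATH (the hdf5 spec below is only a placeholder), a Python workflow driver could rebuild a pinned software stack through Spack's command-line interface before launching an experiment:

    import subprocess

    def spack_install(spec):
        """Install a fully pinned Spack spec so the stack can be rebuilt elsewhere."""
        subprocess.run(["spack", "install", spec], check=True)

    # Placeholder spec with a pinned version and variant; substitute the
    # real workload's dependencies here.
    spec = "hdf5@1.10.4+mpi"
    spack_install(spec)

    # Ask Spack where the package landed so the experiment can link against it.
    prefix = subprocess.run(
        ["spack", "location", "-i", spec],
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    print("installed at", prefix)

Pinning versions and variants in the spec, rather than relying on whatever the machine happens to provide, is what makes such a workflow rebuildable on a different system.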