To meet the demands of high-performance computing (HPC) researchers, large-scale computational and storage machines require many staff members to design, install, and maintain them. These HPC systems professionals include system engineers, system administrators, network administrators, storage administrators, and operations staff, all of whom face problems specific to high-performance systems.
The Systems Professionals Workshop is intended as a platform for discussing the unique challenges of supporting large-scale, high-performance systems. We are soliciting submissions that speak directly to the state of the practice of standing up and operating high-performance systems, with an emphasis on solutions that systems staff at other institutions can implement.
Topics of interest include those listed below. The list is meant to indicate direction, not to exclude other related topics.
-Cluster, configuration, or software management
-Performance tuning/Benchmarking
-Resource manager and job scheduler configuration
-Monitoring/Mean-time-to-failure/ROI/Resource utilization
-Virtualization/Clouds
-Designing and troubleshooting HPC interconnects
-Designing and maintaining HPC storage solutions
-Cybersecurity and data protection
-Cluster storage
Example paper ideas might be:
-Best practices for job scheduler configuration
-Advantages of cluster automation
-Managing software on HPC clusters