OpenACC Birds of a Feather (BOF) at SC21

OpenACC is dedicated to helping the developer community expand its accelerated parallel computing skills. It supports a directive-based, high-level programming model designed to provide an easy on-ramp to accelerated computing on CPUs, GPUs, and other devices. OpenACC supports over 25 global hackathons each year, and its user-friendly programming model has facilitated the acceleration of over 200 applications on multiple platforms, e.g., Perlmutter, Summit, Sunway TaihuLight, and Piz Daint. This BoF invites scientists, programmers, and researchers to discuss their experiences in adopting OpenACC for scientific applications and to learn about implementers' roadmaps and the latest developments in the language specification.

OpenACC User Experience, Hackathons, Vendor Reaction, Relevance, and Roadmap

Agenda

Welcome and OpenACC Organizational Update, Jack Wells, President of OpenACC Video PDF
OpenACC Specification Update, Jeff Larkin, Technical Committee Co-Chair Video PDF
Compiler Implementations
- HPE Updates: John Levesque, HPE Video PDF
- GCC Updates: Thomas Schwinge, Siemens Video PDF
Porting Scientific Applications with OpenACC: Real-world Use Cases
- On the Road to Code Portability, Stéphane Ethier, Princeton Plasma Physics Laboratory (PPPL) Video
  PPPL scientists have successfully ported several codes to NVIDIA GPUs using the OpenACC programming model. It has been our preferred approach due to its ease of implementation and non-interference with the CPU code. While the hope was that OpenACC would become the de facto directive-based programming model for accelerators of all types, it appears that OpenMP is now being promoted more forcefully. Unfortunately, this transition period can be painful for developers who have to work with immature implementations.
- Can Fortran’s ‘do concurrent’ Replace Directives for Accelerated Computing? Ron Caplan, Predictive Science Video PDF
  Recently, there has been growing interest in using standard language constructs (e.g. C++’s Parallel Algorithms and Fortran’s `do concurrent`) for accelerated computing as an alternative to directive-based APIs (e.g. OpenMP and OpenACC). These constructs have the potential to be more portable, and some compilers already support such standards or plan to. Here, we look at the current capabilities, portability, and performance of replacing directives with Fortran’s `do concurrent`, using a mini-app that currently uses OpenACC for GPU acceleration and OpenMP for multi-core CPU parallelism. We replace as many directives as possible with `do concurrent`, testing various configurations and compiler options with three major compilers: GNU’s gfortran, NVIDIA’s nvfortran, and Intel’s ifort. We find that with the right compiler versions and flags, many directives can be replaced without loss of performance or portability, and, in the case of nvfortran, they can all be replaced. We discuss limitations that may apply to more complicated codes and future language additions that may mitigate them. Singularity containers are publicly provided to allow the results to be reproduced.
Training and Education, Julia Levites, Marketing Committee Chair Video
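
To make the `do concurrent` abstract above more concrete, here is a minimal sketch of the kind of replacement it describes: the same loop written once with an OpenACC directive and once with Fortran's standard `do concurrent`. It is not taken from the mini-app used in the talk. The program name, array names, and sizes are made up for illustration. Whether either loop actually runs on a GPU depends on the compiler and its flags; with nvfortran, for example, `-acc` enables OpenACC and `-stdpar` enables offload of `do concurrent`.

```fortran
! Hypothetical example: an AXPY-style loop written two ways.
! All names and sizes are illustrative assumptions.
program axpy_demo
  implicit none
  integer, parameter :: n = 1000
  integer :: i
  real :: a
  real :: x(n), y_acc(n), y_dc(n)

  a = 2.0
  x = 1.0
  y_acc = 3.0
  y_dc = 3.0

  ! Directive-based version. A compiler without OpenACC support
  ! treats the directive as a comment and runs the loop serially,
  ! so the CPU code path is unchanged.
  !$acc parallel loop
  do i = 1, n
    y_acc(i) = a * x(i) + y_acc(i)
  end do

  ! Standard-language version. No directive is needed; the compiler
  ! is allowed to execute the iterations in any order or in parallel.
  do concurrent (i = 1:n)
    y_dc(i) = a * x(i) + y_dc(i)
  end do

  ! Both loops perform the same arithmetic, so the maximum
  ! difference between the two results is expected to be zero.
  print *, 'max difference:', maxval(abs(y_acc - y_dc))
end program axpy_demo
```

A loop like this is the easy case. Loops that need reductions, atomics, or explicit control over data movement are the sort of situation where, as the abstract notes, directives may still be required.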