HPC and AI on Arm architecture: Observations, learnings and path forward


This workshop aims to provide a bird's-eye view of the latest contributions and progress in industry and academia toward furthering Arm architecture adoption for HPC and AI. Several systems based on Fujitsu's A64FX, AWS Graviton3, and Ampere Altra processors are available today, and the Arm HPC/AI ecosystem now offers mature software tools and applications. New Arm-based designs, such as the "Rhea" CPU by SiPearl, CPU designs from ETRI, and the "Grace" CPU by Nvidia, reinforce the value proposition of Arm for HPC/AI.


In this workshop, industry experts and users will share their experiences using Arm-based systems for HPC/AI problem solving. Talks will present performance results comparing different architectures and systems, and will discuss portability concerns, performance analysis tools, and compiler code generation and maturity. We will cover case studies on optimizing applications for Arm and on NEON/SVE vs. AVX vectorization, among several other topics. In addition, we plan to hold a panel discussion on the workshop theme.


Tentative Outline: Monday, 18 Dec 2023





Time      Session                                                              Duration

2:00 pm   Opening remarks                                                      5 mins
2:05 pm   Talk 1: Riken (topic to be confirmed)                                35 mins
2:40 pm   Talk 2: Arm - Architecture innovation and future of Arm processors   25 mins
3:05 pm   Talk 3: AWS - Deep-dive into AWS Graviton processors                 25 mins
3:30 pm   Break                                                                15 mins
3:45 pm   Talk 4: Fujitsu (topic to be confirmed)                              25 mins
4:10 pm   Talk 5: Ampere - Designing Arm processors, choices and challenges    25 mins
4:35 pm   Panel session: Designing future machines for HPC and AI,             25 mins
          with perspectives from industry and academia




Workshop topics include:

  • Experiences on GPU-accelerated Arm-based HPC platforms.
  • Adoption of Arm in the HPC community: benchmarks and applications.
  • HPC on Arm-based offerings in the cloud: AWS Graviton, Ampere Altra, and others.
  • Performance comparison of open-source HPC apps on Arm vs. x86 architectures.
  • Machine learning and AI on Arm servers: libraries, BF16, optimized ML models.
  • Porting to Arm and usage of intrinsic functions: real-world experiences.
  • Tools availability for porting and performance tuning on Arm.
  • SVE and compiler auto-vectorization opportunities.
  • HPC and AI application case studies.


Organizers, Speakers

Workshop Chair, Co-Chair:

Rama Malladi, Amazon Web Services, India

Bhagyaraju Kasina / Santosh Kumar, Amazon Web Services, India


Tentative Speakers

Ashok Bhat, Arm, India

Sanjay Tiwary / Bhagyaraju Kasina / Santosh Kumar, Amazon Web Services, India

Vinod Kumar, Ampere, India

Priyanka Sharma, Fujitsu, India