New NVIDIA H200 GPU Node Coming to Afton

We are excited to announce the expansion of UVA’s AI computing capabilities. On April 3, we are adding an NVIDIA HGX H200 GPU node to the Afton high-performance computing (HPC) cluster, the first of many planned for the coming year. Each node provides 2TB of CPU memory and eight interconnected Tensor Core GPUs, each with 141GB of GPU memory. These GPUs offer higher performance than those in the current Afton GPU nodes, opening new possibilities for the most challenging deep learning and large language model computations.

How can I get access to the new hardware?

On April 3, the new GPU node will become available as a beta release. During this phase, it will be accessible to all users with active Afton HPC allocations. Please follow these instructions to use the HGX H200 node in your jobs. Jobs running on the new hardware will consume service units at charge rates that reflect the actual hardware and service cost. A complete list of service unit charge rates is available on our webpage. Please be aware that during the beta phase the HGX H200 node may be removed from service on short notice, outside of scheduled maintenance windows, if configuration adjustments are needed as the new hardware begins operating at scale.

On May 1, the new HGX H200 node will be released into full production. Access and service unit charge rates will remain in effect as posted. Any additional hardware upgrades or configuration updates will be handled as part of Afton’s regular, pre-announced maintenance cycles.

Where can I learn more?

More detailed descriptions of the new hardware’s capabilities and how to use it are available on our website. In addition, you may reach out to our User Services team during virtual office hours or by submitting a support request.