What is the AI.Panther cluster?

The AI.Panther HPC Cluster at Florida Tech is an Aspen Systems cluster comprising 16 compute nodes (768 processor cores and 6,144 GB RAM in total), 12 GPU nodes, 3 storage nodes, 1 login node, 1 head node, and a DDN ExaScaler storage system. Eight of the GPU nodes contain 4 NVIDIA A100 GPUs each, and four of the GPU nodes contain 8 NVIDIA H200 GPUs each. The AI.Panther cluster was initially funded by a National Science Foundation Major Research Instrumentation grant and expanded through a NIST ASCEND grant.

An HPC cluster is a set of computers working in parallel that performs like a supercomputer at a fraction of the price. The cluster is made up of nodes connected by a high-speed network that carry out computationally intensive tasks.
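
The sketch below shows, in broad strokes, how such parallel work is expressed. It uses MPI through the mpi4py Python package; whether MPI and mpi4py are actually installed on AI.Panther is an assumption, not something this page documents.

    # hello_mpi.py -- minimal sketch of splitting work across cluster nodes.
    # ASSUMPTION: an MPI library and the mpi4py package are available; this
    # page does not document the cluster's installed software stack.
    from mpi4py import MPI

    comm = MPI.COMM_WORLD     # communicator spanning all launched processes
    rank = comm.Get_rank()    # this process's ID, 0 .. size-1
    size = comm.Get_size()    # total number of processes across the nodes

    # Each process sums its own slice of the same range of numbers.
    local_sum = sum(x for x in range(1_000_000) if x % size == rank)

    # Combine the partial results on rank 0 and report once.
    total = comm.reduce(local_sum, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"Sum computed by {size} processes: {total}")

Launched with an MPI runner (for example, mpirun -n 4 python hello_mpi.py), each copy of the script handles a different share of the range and only rank 0 prints the combined result.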

To learn about High Performance Computing or Supercomputers, see Wikipedia.

Hardware

Head / Login Nodes

  • 4 x Intel Xeon Cascade Lake Silver 4210R, 2.4 GHz 10-Core CPUs

  • 12 x 8GB = 96GB RAM

  • 4 x 960GB Enterprise SSD 

A100 PCIe GPU Nodes

  • 4 x AMD EPYC 7402P Rome, 2.8 GHz 24-Core CPUs

  • 32 x 32GB = 1024GB RAM

  • 4 x 960GB Enterprise SSD

  • 16 x NVIDIA Tesla Ampere A100 40GB Memory, PCIe

  • 24 x A100 NVLink Bridge

A100 SXM4 GPU Nodes

  • 8 x AMD EPYC 7402 Rome, 2.8 GHz 24-Core CPUs

  • 64 x 32GB = 2048GB RAM

  • 4 x 960GB Enterprise SSD

  • 16 x NVIDIA Tesla Ampere A100 40GB Memory, 4 baseboards with 4 A100s each, connected via NVLink

H200 SXM GPU Nodes 

  • 4 x Dell PowerEdge XE9680 GPU compute nodes

  • 2 x Intel Xeon Platinum 8562Y+, 32-Core CPUs

  • 32 x 96GB = 3072GB RAM

  • 1 x Dell BOSS-N1 NVMe (960 GB usable, OS device)

  • 4 x Dell NVMe ISE PS1010 RI U.2 3.84 TB (node-local NVMe)

  • 32 x NVIDIA H200 141GB Memory, 4 nodes with 8 H200s each, connected via NVLink (see the GPU check sketch after this list)
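
Because the software environment on these GPU nodes is not described here, the following is only a hedged sketch of how a user might confirm which GPUs a job can see; it assumes a Python environment with a CUDA-enabled PyTorch build, which this page does not guarantee.

    # check_gpus.py -- sketch of listing the GPUs visible to a job.
    # ASSUMPTION: a CUDA-enabled PyTorch build is available in the user's
    # environment; this page does not document installed software.
    import torch

    if torch.cuda.is_available():
        count = torch.cuda.device_count()
        print(f"{count} GPU(s) visible")
        for i in range(count):
            props = torch.cuda.get_device_properties(i)
            mem_gb = props.total_memory / 1024**3
            print(f"  GPU {i}: {props.name}, {mem_gb:.0f} GB memory")
    else:
        print("No CUDA-capable GPU visible (CPU node, or none allocated)")

On an A100 node the per-device memory reported should be roughly 40 GB, and on an H200 node roughly 141 GB, matching the figures above.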

High Memory Compute Nodes

  • 32 x Intel Xeon Cascade Lake Gold 6240R, 2.4GHz, 24-Core CPUs

  • 192 x 32GB = 6144GB RAM

  • 16 x 960GB Enterprise SSD

ZFS User/Home Fileserver (78TB after overhead) - /home1

  • 2 x Intel Xeon Cascade Lake Silver 4215R, 3.20GHz, 8-Core CPUs

  • 12 x 16GB = 192GB RAM

  • 2 x 240GB Enterprise SSD 

  • 8 x 14TB SAS HDD configured as RAIDZ2

  • 2 x P4800X 375GB Optane SSDs 

ZFS Archive Fileserver (420TB after overhead) - /archive

  • 2 x Intel Xeon Cascade Lake Silver 4215R, 3.20GHz, 8-Core CPUs 

  • 12 x 32GB = 384GB RAM

  • 2 x 240GB Enterprise SSD 

  • 36 x 16TB SATA HDD configured as four RAIDZ2 arrays

  • 2 x P4800X 375GB Optane SSDs 

DDN AI/HPC Storage System (Project/Shared Scratch) - /shared

  • DDN AI400X2 All Flash Appliance Bundle

  • 24 x NVMe slots, with 2 x SE2420 enclosures (48 x NVMe slots)

  • 8 x NDR200 / 200GbE QSFP112 ports

  • ~1 PiB usable capacity
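
The three file systems above are mounted at /home1, /archive, and /shared. As a rough sketch (assuming those mount points are visible from the node you are logged into, which this page does not guarantee for every node type), their capacity and free space can be checked from Python:

    # storage_check.py -- sketch of reporting free space on the cluster
    # file systems named above. ASSUMPTION: the mount points are visible
    # from the node where this runs.
    import shutil

    for mount in ("/home1", "/archive", "/shared"):
        try:
            usage = shutil.disk_usage(mount)
            free_tib = usage.free / 1024**4
            total_tib = usage.total / 1024**4
            print(f"{mount}: {free_tib:.1f} TiB free of {total_tib:.1f} TiB")
        except FileNotFoundError:
            print(f"{mount}: not mounted on this node")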

Network Switches

  • One NVIDIA NDR InfiniBand switch (MQM9700-NS2R) for H200 GPU Nodes and DDN Storage

  • One Mellanox/NVIDIA HDR IB switch for CPU and A100 GPU Nodes

  • One managed 1GbE switch

  • One 25Gb (18 port)/100Gb (4 port) Ethernet switch 

  • Nodes have a 25Gb Ethernet connection to the 25GbE switch, plus dual-port 1GbE with IPMI and an HDR100 IB connection. 100GbE links from the 25GbE switch are available for future expansion (see the interface-listing sketch below).
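
A rough way to confirm which of these interfaces a node exposes is to list the kernel's network and InfiniBand device directories. The sysfs paths used here are standard on Linux, but treating them as present on every AI.Panther node is an assumption.

    # list_interfaces.py -- sketch of listing Ethernet and InfiniBand devices
    # on a node via sysfs. ASSUMPTION: Linux sysfs layout; the InfiniBand
    # directory only exists where the IB drivers are loaded.
    import os

    def list_dir(path):
        try:
            return sorted(os.listdir(path))
        except FileNotFoundError:
            return []

    print("Network interfaces:", ", ".join(list_dir("/sys/class/net")) or "none")
    print("InfiniBand devices:", ", ".join(list_dir("/sys/class/infiniband")) or "none")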