Supermicro Sets New STAC-ML Markets Inference Benchmark Records With NVIDIA
Joint Innovation Delivers Industry-Leading Performance for AI Inference in Financial Markets
San Jose, CA – Supermicro, in collaboration with NVIDIA, has achieved breakthrough performance results in STAC-ML™ Markets (Inference) benchmarks, demonstrating the power of combining Supermicro's advanced server architecture with NVIDIA's cutting-edge AI platform.
Setting a New Industry Standard for Inference Performance
Compared to the previous FPGA-based record, the NVIDIA GH200 Grace Hopper Superchip in a Supermicro ARS-111GL-NHR server delivered:
- Up to 49% lower latency on large models
- 44% higher energy efficiency
- 8–13× lower inference error rates
- Latency as low as 4.67 µs (99th percentile)
These results establish a new benchmark for AI inference performance in financial markets, demonstrating exceptional speed, accuracy, and efficiency in a single, integrated solution.
Record-Breaking Performance Details
STAC recently completed an audited benchmark of the Supermicro ARS-111GL-NHR server powered by the NVIDIA GH200 Grace Hopper™ Superchip. The system delivered strong results across all model sizes tested.
Compared to previous FPGA-based systems, the joint solution delivered the following improvements:
LSTM_A (Small Model)
- Up to 20% lower latency with 8 model instances (4.67 µs vs 5.97 µs)
- 8× better accuracy (99th-percentile error of 0.00111 vs 0.00889)
LSTM_B (Medium Model)
- Up to 8% lower latency with 4 model instances (7.10 µs vs 7.73 µs)
- 12× better accuracy (99th-percentile error of 0.00102 vs 0.0127)
LSTM_C (Large Model)
- 49% lower latency (15.8 µs vs 31.0 µs)
- 15% higher throughput (3,910 vs 3,387)
- 44% better energy efficiency (8,312 vs 5,785)
- 13× better accuracy (99th-percentile error of 0.00172 vs 0.0237)
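The percentages and multiples above can be reproduced from the quoted pairs of figures. The following is a minimal sketch, not part of the STAC report: the dictionary and function names are hypothetical, and it assumes each pair is (GH200 result, previous FPGA-based result) as listed in this article.

```python
# Figures quoted in this article, as (GH200 result, previous FPGA-based result).
# The names below are illustrative assumptions, not identifiers from STAC.
LATENCY_US = {
    "LSTM_A": (4.67, 5.97),
    "LSTM_B": (7.10, 7.73),
    "LSTM_C": (15.8, 31.0),
}
ERROR_P99 = {
    "LSTM_A": (0.00111, 0.00889),
    "LSTM_B": (0.00102, 0.0127),
    "LSTM_C": (0.00172, 0.0237),
}
LSTM_C_THROUGHPUT = (3910, 3387)
LSTM_C_ENERGY_EFFICIENCY = (8312, 5785)


def percent_lower(new, old):
    """Percentage by which `new` is below `old` (for lower-is-better metrics)."""
    return (old - new) / old * 100.0


def percent_higher(new, old):
    """Percentage by which `new` is above `old` (for higher-is-better metrics)."""
    return (new - old) / old * 100.0


def times_lower(new, old):
    """How many times smaller `new` is than `old`."""
    return old / new


def summarize():
    """Return the derived improvement figures for each model."""
    rows = {}
    for model in LATENCY_US:
        rows[model] = {
            "latency_pct_lower": percent_lower(*LATENCY_US[model]),
            "error_times_lower": times_lower(*ERROR_P99[model]),
        }
    rows["LSTM_C"]["throughput_pct_higher"] = percent_higher(*LSTM_C_THROUGHPUT)
    rows["LSTM_C"]["energy_pct_higher"] = percent_higher(*LSTM_C_ENERGY_EFFICIENCY)
    return rows


if __name__ == "__main__":
    for model, row in summarize().items():
        print(model, {name: round(value, 1) for name, value in row.items()})
```

Run this way, the derived values line up with the rounded headline figures for LSTM_B and LSTM_C. The quoted LSTM_A latency pair works out to roughly 22% rather than 20%, so the headline figure there appears to be conservative or based on rounding not shown here.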
The Power of Collaboration
This achievement showcases what Supermicro's optimized server design and the NVIDIA GH200 Grace Hopper Superchip architecture can deliver together. The NVIDIA GH200 Grace Hopper™ Superchip combines the NVIDIA Grace™ CPU and NVIDIA Hopper GPU in a single superchip, delivering high memory bandwidth and energy efficiency for AI inference workloads.
The combination of the Supermicro ARS-111GL-NHR platform and the NVIDIA GH200 Grace Hopper Superchip represents a significant leap forward for financial institutions deploying AI inference solutions, demonstrating both companies' commitment to delivering the highest-performance solutions for mission-critical applications.
Built for Financial Markets
STAC-ML Markets (Inference) is the technology benchmark standard for solutions used to run inference on real-time market data. Designed by quants and technologists from leading financial firms, the benchmarks evaluate latency, throughput, energy efficiency, space efficiency, and algorithm quality across various model configurations.
The Supermicro ARS-111GL-NHR with NVIDIA GH200 Grace Hopper Superchip excels in all these dimensions, making it an ideal platform for:
- Real-time trading algorithms
- Risk analysis and portfolio optimization
- Market prediction models
- High-frequency trading strategies
- Quantitative research workloads
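The latency and error figures quoted in this article are 99th-percentile values. As a rough illustration of what that statistic means, the sketch below computes a percentile from a list of per-inference latency samples using the nearest-rank method. It is an assumption for illustration only: the function name and the sample data are hypothetical, and STAC's audited methodology may define and measure percentiles differently.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the nearest-rank percentile of `samples` (pct in (0, 100])."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    # Nearest rank: the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical latency samples in microseconds, not measured data.
    latencies_us = [4.2, 4.3, 4.1, 4.4, 9.8]
    print("median:", percentile_nearest_rank(latencies_us, 50))
    print("p99:", percentile_nearest_rank(latencies_us, 99))
```

A tail percentile such as p99 is sensitive to rare slow inferences, which is why it is a common way to report latency for real-time workloads: in the hypothetical samples above, a single outlier sets the p99 while leaving the median unchanged.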
Technical Specifications
Supermicro ARS-111GL-NHR Server
- Purpose-built for NVIDIA GH200 Grace Hopper Superchip
- Optimized thermal design for sustained performance
- Advanced power management for maximum efficiency
- Industry-leading reliability and serviceability
NVIDIA GH200 Grace Hopper™ Superchip
- NVIDIA Grace CPU with Arm®-based cores
- NVIDIA Hopper GPU architecture
- 900 GB/s coherent memory bandwidth between CPU and GPU (NVLink-C2C)
- Unified memory architecture
Availability
The Supermicro ARS-111GL-NHR with NVIDIA GH200 Grace Hopper Superchip is available now. For detailed configuration information and to learn more about how this solution can accelerate your AI inference workloads, contact your Supermicro sales representative or visit supermicro.com.
Full STAC benchmark reports are available to STAC Observer members at docs.stacresearch.com/SMC250910.