CuPerf
2026-01-01 · 1 min read · 77 words
A modern, extensible command-line tool for benchmarking GPU performance on NVIDIA CUDA devices.
Features
- Provides accurate, reproducible measurements of memory bandwidth, compute throughput, tensor core performance, kernel launch overhead, and reduction performance
- Supports multiple data types (FP32, FP16, BF16, INT8, FP4)
- Reports comprehensive statistics and supports multiple output formats (console, JSON, CSV)
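To make the statistics and bandwidth features above more concrete, here is a minimal host-side sketch of how repeated timing samples might be reduced to summary statistics and converted to an effective bandwidth figure. It is an illustration only, not CuPerf's actual code: `TimingSummary`, `summarize`, and `bandwidth_gb_per_s` are hypothetical names, and the samples are assumed to be per-iteration kernel times in milliseconds (for example, as measured with CUDA events).

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical summary type for illustration; not taken from CuPerf's source.
struct TimingSummary {
    double mean_ms;
    double median_ms;
    double stddev_ms;  // sample standard deviation (n - 1 denominator)
    double min_ms;
    double max_ms;
};

// Reduce a set of per-iteration timings (milliseconds) to summary statistics.
// The vector is taken by value so it can be sorted without touching the caller's copy.
TimingSummary summarize(std::vector<double> samples_ms) {
    assert(!samples_ms.empty());
    std::sort(samples_ms.begin(), samples_ms.end());
    const std::size_t n = samples_ms.size();

    double sum = 0.0;
    for (double s : samples_ms) sum += s;
    const double mean = sum / static_cast<double>(n);

    double sq_dev = 0.0;
    for (double s : samples_ms) sq_dev += (s - mean) * (s - mean);
    const double stddev =
        n > 1 ? std::sqrt(sq_dev / static_cast<double>(n - 1)) : 0.0;

    // Median: middle element for odd n, average of the two middle elements for even n.
    const double median =
        (n % 2 == 1) ? samples_ms[n / 2]
                     : 0.5 * (samples_ms[n / 2 - 1] + samples_ms[n / 2]);

    return {mean, median, stddev, samples_ms.front(), samples_ms.back()};
}

// Effective bandwidth in GB/s (1 GB = 1e9 bytes) for a given number of
// bytes read plus written and an elapsed time in milliseconds.
double bandwidth_gb_per_s(double bytes_moved, double elapsed_ms) {
    return bytes_moved / (elapsed_ms * 1e-3) / 1e9;
}
```

Reporting the median alongside the mean is a common choice for benchmarks because a few slow outlier iterations (clock ramp-up, for instance) shift the mean much more than the median. Passing `median_ms` to `bandwidth_gb_per_s` would give one reasonable headline number under that assumption.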
Technologies
CUDA, C++, Parallel Computing, Profiling
Status
Active Development - Jan 2026 - Present
See Also
- Resume - See this project in my resume