Most popular benchmark repositories and open source projects

training

Reference implementations of MLPerf™ training benchmarks

570   1676   1676  

tapnet

Tracking Any Point (TAP)

151   1607   1607  

fastRAG

Efficient Retrieval Augmentation and Generation Framework

145   1565   1565  

nanobench

Simple, fast, accurate single-header microbenchmarking functionality f...

90   1554   1554  

Awesome-LLM-Long-Context-Modeling

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

58   1530   1530  

LLM-eval-survey

The official GitHub page for the survey paper "A Survey on Evaluation...

96   1530   1530  

evalplus

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 202...

157   1475   1475  

py-motmetrics

📊 Benchmark multiple object trackers (MOT) in Python

260   1441   1441  

llm-colosseum

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluat...

173   1429   1429  

inference

Reference implementations of MLPerf™ inference benchmarks

549   1391   1391  

pytest-benchmark

pytest fixture for benchmarking code

121   1340   1340  

jsperf.com

jsperf.com v2. https://github.com/h5bp/lazyweb-requests/issues/174

128   1315   1315  

SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion li...

274   1283   1283  

Attabench

Microbenchmarking app for Swift with nice log-log plots

46   1281   1281  

FastExpressionCompiler

Fast Compiler for C# Expression Trees and the lightweight LightExpress...

88   1278   1278  

smac

SMAC: The StarCraft Multi-Agent Challenge

233   1213   1213  

boomer

A better load generator for locust, written in golang.

241   1208   1208  

MedMNIST

[pip install medmnist] 18x Standardized Datasets for 2D and 3D Biomedi...

182   1200   1200  

bench-scripts

A compilation of Linux server benchmarking scripts.

169   1186   1186  

appdocs

Application Performance Optimization Summary

466   1174   1174  

divan

Fast and simple benchmarking for Rust projects

32   1154   1154  

TBCF

Tracking Benchmark for Correlation Filters

328   1146   1146  

flow

Computational framework for reinforcement learning in traffic control

386   1119   1119  

github-action-benchmark

GitHub Action for continuous benchmarking to keep performance

169   1118   1118  

Awesome-System2-Reasoning-LLM

Latest Advances on System-2 Reasoning

48   1051   1051  

php-framework-benchmark

PHP Framework Benchmark

161   1027   1027  

VBench

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

58   1024   1024  

primesieve

🚀 Fast prime number generator

124   1014   1014  

ut

UT: C++20 μ(micro)/Unit Testing Framework

86   1007   1007  

memtier_benchmark

NoSQL Redis and Memcache traffic generation and benchmarking tool.

233   967   967  

lzbench

lzbench is an in-memory benchmark of open-source compressors

195   962   962  

omnisafe

JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL...

135   958   958  

benchmark

TorchBench is a collection of open source benchmarks used to evaluate...

310   947   947  

ADBench

Official Implementation of "ADBench: Anomaly Detection Benchmark", NeurIPS...

141   940   940  

Monocular-Depth-Estimation-Toolbox

Monocular Depth Estimation Toolbox based on MMSegmentation.

108   938   938  

asv

Airspeed Velocity: A simple Python benchmarking tool with web-based re...

190   927   927  

OpenSTL

OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learn...

151   925   925  

RoboTwin

[CVPR 25 Highlight & ECCV Workshop 24 Best Paper] RoboTwin Dual-arm Ro...

106   924   924  

agoo

A High Performance HTTP Server for Ruby

40   917   917  

pyperformance

Python Performance Benchmark Suite

188   914   914  

grpc_bench

Various gRPC benchmarks

144   914   914  

PDEBench

PDEBench: An Extensive Benchmark for Scientific Machine Learning

108   907   907  

LongBench

LongBench v2 and LongBench (ACL 2024)

87   892   892  

moses

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generati...

257   890   890  

nench

VPS benchmark script — based on the popular bench.sh, plus CPU and iop...

116   888   888  

AoE

AoE (AI on Edge, on-device intelligence, edge computing) is an on-device AI integrated runtime environment (IRE...

133   885   885  

IocPerformance

Performance comparison of .NET IoC containers

157   884   884  

s3-benchmark

Measure Amazon S3's performance from any location.

136   857   857  

Celero

C++ Benchmark Authoring Library/Framework

98   847   847  

mimic3-benchmarks

Python suite to construct benchmark machine learning datasets from the...

338   837   837