Most popular benchmark repositories and open source projects

ClickBench ClickHouse HTML

ClickBench: a Benchmark For Analytical Databases

990 267 990

Monocular-Depth-Estimation-Toolbox zhyever Python

Monocular Depth Estimation Toolbox based on MMSegmentation.

967 112 967

moses molecularsets Python

Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models

966 274 966

AICGSecEval Tencent Python

A.S.E (AICGSecEval) is a repository-level AI-generated code security evaluation benchmark developed by Tencent Wukong Code Security Team.

963 105 963

KernelBench ScalingIntelligence Jupyter Notebook

KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)

946 158 946

grpc_bench LesnyRumcajs Dockerfile

Various gRPC benchmarks

935 149 935

blazehttp chaitin Go

BlazeHTTP is a simple, user-friendly tool for evaluating WAF protection efficacy.

934 108 934

opencv_zoo opencv Python

Model Zoo For OpenCV DNN and Benchmarks.

934 285 934

agoo ohler55 C

A High Performance HTTP Server for Ruby

927 40 927

nench n-st Shell

VPS benchmark script — based on the popular bench.sh, plus CPU and ioping tests, and dual-stack IPv4 and v6 speedtests by default

914 115 914

s3-benchmark dvassallo Go

Measure Amazon S3's performance from any location.

910 144 910

AoE didi C++

AoE (AI on Edge: on-device intelligence, edge computing) is an on-device AI integrated runtime environment (IRE) that helps developers improve efficiency.

887 133 887

IocPerformance danielpalme C#

Performance comparison of .NET IoC containers

885 154 885

mimic3-benchmarks YerevaNN Python

Python suite to construct benchmark machine learning datasets from the MIMIC-III 💊 clinical database.

881 344 881

InferenceX SemiAnalysisAI Python

Open Source Continuous Inference Benchmarking Qwen3.5, DeepSeek, GPTOSS - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 vs H100 & soon™ TPUv6e/v7/Train...

876 147 876

rl4co ai4co Python

A PyTorch library for all things Reinforcement Learning (RL) for Combinatorial Optimization (CO)

865 145 865

Celero DigitalInBlue C++

C++ Benchmark Authoring Library/Framework

860 98 860

nvbench NVIDIA Cuda

CUDA Kernel Benchmarking Library

856 104 856

CBLUE CBLUEbenchmark Python

[CBLUE1] CBLUE, a Chinese medical information processing benchmark: A Chinese Biomedical Language Understanding Evaluation Benchmark

841 138 841

CrossPlatformDiskTest maxim-saplin C#

Windows, macOS and Android storage (HDD, SSD, RAM) speed testing/performance benchmarking app

836 46 836

human-learn koaning Jupyter Notebook

Natural Intelligence is still a pretty good idea.

832 56 832

huststore Qihoo360 C

High-performance Distributed Storage

830 173 830

bencher bencherdev MDX

🐰 Bencher - Continuous Benchmarking

826 43 826

WeatherBench pangeo-data Jupyter Notebook

A benchmark dataset for data-driven weather forecasting

824 177 824

typescript-runtime-type-benchmarks moltar TypeScript

📊 Benchmark Comparison of Packages with Runtime Validation and TypeScript Support

818 87 818

meta-dataset google-research Jupyter Notebook

A dataset of datasets for learning to learn from few examples

801 141 801

sbt-jmh sbt Scala

"Trust no one, bench everything." - sbt plugin for JMH (Java Microbenchmark Harness)

797 87 797

Programming-Language-Benchmarks hanabi1224 C#

Yet another implementation of computer language benchmarks game

794 163 794

http_bench linkxzhou Go

Golang HTTP stress testing tool; supports single and distributed modes, HTTP/1, HTTP/2 and HTTP/3.

792 31 792

ISC-Bench wuyoscar Python

Internal Safety Collapse: Turning the LLM or an AI Agent into a sensitive data generator.

783 120 783

r3f-perf utsuboco TypeScript

Easily monitor your ThreeJS performance.

772 37 772

warp minio Go

S3 benchmarking tool

772 176 772

robustbench RobustBench Python

RobustBench: a standardized adversarial robustness benchmark [NeurIPS 2021 Benchmarks and Datasets Track]

772 105 772

HammerDB TPC-Council Tcl

HammerDB: The industry standard open-source database benchmark

752 145 752

caffenet-benchmark ducha-aiki Jupyter Notebook

Evaluation of the CNN design choices performance on ImageNet-2012.

743 150 743

OpenCUA xlang-ai Python

OpenCUA: Open Foundations for Computer-Use Agents

740 97 740

tape songlab-cal Python

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of prote...

738 135 738