Topic

benchmark

Repositories (1763)

dbtester
dbtester etcd-io Go

Distributed database benchmark tester

290
dnspyre
dnspyre Tantalor93 Go

CLI tool for a high QPS DNS benchmark

290
Turbo-Run-Length-Encoding
Turbo-Run-Length-Encoding powturbo C

TurboRLE: Fastest Run Length Encoding

289
java-object-mapper-benchmark
java-object-mapper-benchmark arey Java

JMH benchmark of Java object-to-object mapping frameworks

288
HeCBench
HeCBench ORNL C++
286
awesome-diffusion-v2v
awesome-diffusion-v2v wenhao728 Python

Awesome diffusion Video-to-Video (V2V). A collection of papers on diffusion model-based video editing, a.k.a. video-to-video (V2V) translation. And a vid...

285
physics-IQ-benchmark
physics-IQ-benchmark google-deepmind Python

Benchmarking physical understanding in generative video models

285
pantheon
pantheon StanfordSNR Python

Pantheon of Congestion Control

284
RHEL7-STIG
RHEL7-STIG ansible-lockdown YAML

Automated STIG Benchmark Compliance Remediation for RHEL 7 with Ansible

283
benchexec
benchexec sosy-lab Python

BenchExec: A Framework for Reliable Benchmarking and Resource Measurement

281
STPLS3D
STPLS3D meidachen Python

🔥 Synthetic and real-world 2d/3d dataset for semantic and instance segmentation (BMVC 2022 Oral)

281
tf-metal-experiments
tf-metal-experiments tlkh Jupyter Notebook

TensorFlow Metal Backend on Apple Silicon Experiments (just for fun)

280
goTemplateBenchmark
goTemplateBenchmark slinso Go

Comparing the performance of different template engines

277
web-bench
web-bench bytedance JavaScript

Web-Bench is a benchmark designed to evaluate the performance of LLMs in actual Web development.

272
hand_pose_action
hand_pose_action guiggh Python

Dataset and code for the paper "First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations", CVPR 2018.

271
ProteinWorkshop
ProteinWorkshop a-r-j Python

Benchmarking framework for protein representation learning. Includes a large number of pre-training and downstream task datasets, models and training/...

271
CFDBench
CFDBench luo-yining Python

A large-scale benchmark for machine learning methods in fluid dynamics

271
pgbent
pgbent gregs1104 Shell

PostgreSQL Benchmarking Toolkit

269
graphql-bench
graphql-bench hasura TSQL

A super simple tool to benchmark GraphQL queries

269
SOTA-on-monocular-3D-pose-and-shape-estimation
SOTA-on-monocular-3D-pose-and-shape-estimation Arthur151 Python

State-of-the-art methods on monocular 3D pose estimation / 3D mesh recovery

269
DS-1000
DS-1000 xlang-ai Python

[ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".

269
DeepFund
DeepFund HKUSTDial Python

🔥[NeurIPS'25] DeepFund: Pilot for Your Next Fund Investment

268
WorldScore
WorldScore haoyi-duan Python

Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation

267
elbencho
elbencho breuner C++

A distributed storage benchmark for file systems, object stores & block devices with support for GPUs

266
picows
picows tarasko Python

Ultra-fast websocket client and server for asyncio

265
uVkCompute
uVkCompute google C++

A micro Vulkan compute pipeline and a collection of benchmarking compute shaders

264
VideoGen-Eval
VideoGen-Eval AILab-CVC

VideoGen-Eval: Agent-based System for Video Generation Evaluation

262
MedAgentBench
MedAgentBench stanfordmlgroup Python

MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents

259
myclaw-bench
myclaw-bench LeoYeAI Python

The definitive benchmark for AI agents on OpenClaw. 45 tasks across 4 tiers. Powered by MyClaw.ai

259
Large-Scale-Medical
Large-Scale-Medical Luffy03 Python

[TPAMI 2026] Large-Scale 3D Medical Image Pre-training with Geometric Context Priors

258
app-servers
app-servers costajob Elixir

App Servers benchmarked for: Ruby, Python, JavaScript, Dart, Elixir, Java, Crystal, Nim, Go, Rust

257
imdbench
imdbench geldata Python

IMDBench — Realistic ORM benchmarking

257
MixEval
MixEval JinjieNi Python

The official evaluation suite and dynamic data release for MixEval.

256
foundation-model-benchmarking-tool
foundation-model-benchmarking-tool aws-samples Jupyter Notebook

Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.

255
dbench
dbench leeliu Shell

Benchmark Kubernetes persistent disk volumes with fio: Read/write IOPS, bandwidth MB/s and latency

255
ubench.h
ubench.h sheredom C++

⏱️ single header benchmark framework for C and C++

255
TSB-AD
TSB-AD thedatumorg Python

Time-Series Anomaly Detection | Algorithms + Datasets + Tutorials

255
blue-team
blue-team maldevel Shell

Blue Team Scripts

254
OpenXAI
OpenXAI AI4LIFE-GROUP JavaScript

OpenXAI: Towards a Transparent Evaluation of Model Explanations

254
tracerbench
tracerbench TracerBench TypeScript

Automated Chrome tracing for benchmarking.

252
deep-visual-geo-localization-benchmark
deep-visual-geo-localization-benchmark gmberton Python

Official code for CVPR 2022 (Oral) paper "Deep Visual Geo-localization Benchmark"

251
gungraun
gungraun gungraun Rust

High-precision, one-shot and consistent benchmarking framework/harness for Rust. All Valgrind tools at your fingertips.

250
UBUNTU22-CIS
UBUNTU22-CIS ansible-lockdown YAML

Automated CIS Benchmark Compliance Remediation for Ubuntu 22 with Ansible

250
lex-glue
lex-glue coastalcph Python

LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

250
llm-benchmark
llm-benchmark lework Python

LLM concurrency performance testing tool, with support for automated stress testing and performance report generation.

250
MLVU
MLVU JUNJIE99 Python

🔥🔥MLVU: Multi-task Long Video Understanding Benchmark

249
benchmark-memory
benchmark-memory michaelherold Ruby

Benchmark-style memory profiling for Ruby 2.1+

249
BALROG
BALROG balrog-ai Python

Benchmarking Agentic LLM and VLM Reasoning On Games

247
YACCLAB
YACCLAB prittt C

YACCLAB: Yet Another Connected Components Labeling Benchmark

246
leaky-repo
leaky-repo Plazmaz Python

Benchmarking repo for secrets scanning

245