Most popular benchmark repositories and open source projects

PIMeval-PIMbench UVA-LavaLab C++

PIMeval simulator and PIMbench suite

47 32 47

LearnAct lgy0404 Python

[ACL 2026 Findings] Official code repo for the paper "LearnAct: Few-Shot Mobile GUI Agent with a Unified Demonstration Benchmark"

47 0 47

ui-design-bench SunkenInTime TypeScript

47 4 47

MIPHEI-ViT Sanofi-Public Python

MIPHEI-ViT: Repository to train Image-to-Image H&E to Immunofluorescence models

47 9 47

wordpress-speedtest szepeviktor PHP

VPS Speedtest for WordPress with 160 results: 🏆 UpCloud (raw memory and CPU benchmark)

46 14 46

RediSearchBenchmark RediSearch Go

Benchmarks for the RediSearch module

46 8 46

bots bsc-pm C

Barcelona OpenMP Task Suite is a collection of applications for testing OpenMP tasking implementations and comparing their behaviour under certain...

46 18 46

weblink PXshadow Haxe

Linking Haxe to the role of a web server

46 7 46

api-performance-tests litestar-org Python

Benchmarking Litestar vs other ASGI API frameworks

46 8 46

ReForm-Eval FudanDISC Python

A benchmark for evaluating the capabilities of large vision-language models (LVLMs)

46 3 46

scandinavian-embedding-benchmark KennethEnevoldsen Python

A Scandinavian Benchmark for sentence embeddings

46 9 46

benchmark-privesc-linux ipa-lab Shell

A comprehensive local Linux Privilege-Escalation Benchmark

46 9 46

PQC-LEO crt26 Shell

PQC-LEO is a comprehensive benchmarking and evaluation framework for Post-Quantum Cryptography (PQC), built for researchers. Automates the setup, test...

46 18 46

GameVerse THUSI-Lab Python

GameVerse: Can Vision-Language Models Learn from Video-based Reflection?

46 0 46

variantbenchmarking nf-core Nextflow

Pipeline to evaluate and validate the accuracy of variant calling methods in genomic research

46 27 46

cloud-workbench sealuzh Ruby

Cloud WorkBench (CWB) is a web-based framework grounded in the notion of Infrastructure-as-Code (IaC) to foster simple definition, execution,...

45 6 45

jmeter-grpc-plugin zalopay-oss Java

A JMeter plugin that supports load testing gRPC

45 24 45

dammmdatagen DoktorMike R

Marketing Mix Modeling Data Generator

45 8 45

mdl-stance-robustness UKPLab Python

Multi-dataset stance detection and robustness experiments

45 11 45

wasm-coremark wasm3 HTML

CoreMark 1.0 ported to WebAssembly

45 3 45

vroom-scripts VROOM-Project Python

45 22 45

glassbench Canop Rust

A micro-benchmark framework to use with cargo bench

45 3 45

imp_marl moratodpg Python

IMP-MARL: a Suite of Environments for Large-scale Infrastructure Management Planning via MARL

45 6 45

thunder MICS-Lab Python

[NeurIPS25 D&B Spotlight] A tile-level histopathology image understanding benchmark

45 14 45

BM-code ByteDance-Seed Python

[Arxiv 2025] ByteMorph: Benchmarking Instruction-Guided Image Editing with Non-Rigid Motions

45 1 45

ML4CO-Bench-101 Thinklab-SJTU Python

ML4CO-Bench-101: Benchmark Machine Learning for Classic Combinatorial Problems on Graphs.

45 4 45

loom-webflux-benchmarks chrisgleissner Python

Benchmarks of Spring Boot REST service comparing Java 21 Virtual Threads (Project Loom) with WebFlux (Project Reactor).

45 4 45

MCPToolBenchPP mcp-tool-bench Python

MCPToolBench++: a Model Context Protocol (MCP) tool-use benchmark for AI agent and model tool-use ability

45 9 45

skill-optimizer fastxyz TypeScript

Benchmark and self-optimize SDK/CLI/MCP guidance so every agent model can use your tool reliably.

45 7 45

OrmBenchmark InfoTechBridge C#

ORM Benchmarking

44 10 44

node-red-contrib-actionflows Steveorevo JavaScript

Provides a set of nodes to enable an extendable design pattern for flows.

44 12 44

rop-benchmark ispras Python

ROP Benchmark is a tool to compare ROP compilers

44 7 44

Fuzzle SoftSec-KAIST Python

Fuzzle: Making a Puzzle for Fuzzers (ASE'22)

44 8 44

grade influxdata Go

Track Go benchmark performance over time by storing results in InfluxDB

43 3 43

benchmarkme csgillespie R

Crowd-sourced benchmarking

43 16 43

extended-berkeley-segmentation-benchmark davidstutz C++

Extended version of the Berkeley Segmentation Benchmark [1] used for evaluation in [2].

43 19 43

Unchase.FluentPerformanceMeter unchase C#

🔨 Make exact performance measurements of the public methods of public classes using this NuGet package with a fluent interface. Requires .Ne...

43 3 43

flo-shani-aesni armfazh C

Performance Evaluation of SHA-256 using SHA New Instructions.

43 11 43

eCommerceSearchBench alibaba Jupyter Notebook

E-commerce search benchmark is the first end-to-end application benchmark for e-commerce search systems with personalized recommendations. This work is...

43 20 43

npm-yarn-benchmark artberri Shell

Bash script for comparing NPM and Yarn performance

43 13 43

llm100kbench gqgs Go

LLM 100k portfolio management benchmark

43 0 43

WfCommons wfcommons Python

WFCommons: A Framework for Enabling Scientific Workflow Research and Development

43 14 43

http-benchmark-netty junneyang Java

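An HTTP client tool and high-performance HTTP testing tool based on Java Netty. Flexibly customizable parameters, email report support, and more. Python Tornado version: https://github.com/junneyang/http-benchmark-torna...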

42 33 42

Federated-Benchmark jiahuanluo Python

A Benchmark of Real-world Image Dataset for Federated Learning

42 11 42

gli Graph-Learning-Benchmarks Python

🗂 Graph Learning Indexer: a contributor-friendly and metadata-rich platform for graph learning benchmarks. Dataloading, Benchmarking, Tagging, and mor...

42 20 42