Most popular benchmark repositories and open source projects

pspdfkit-webassembly-benchmark

Source for the PSPDFKit WebAssembly Benchmark: http://iswebassemblyfas...

6   49   49  

modd

Dataset and Evaluation Scripts for Obstacle Detection via Semantic Seg...

5   49   49  

FLBenchmark-toolkit

Federated Learning Framework Benchmark (UniFed)

5   49   49  

swt-bench

[NeurIPS 2024] Evaluation harness for SWT-Bench, a benchmark for evalu...

6   49   49  

DataGen

[ICLR'25] DataGen: Unified Synthetic Dataset Generation via Large Lang...

1   49   49  

optimization-demo

⚡ Optimizing Python code by implementing a C++ extension

1   48   48  

vbr-devkit

Vision Benchmark in Rome Development Kit

1   48   48  

benchllama

Benchmark your local LLMs.

2   48   48  

MedIAnomaly

[MedIA 2025] MedIAnomaly: A comparative study of anomaly detection in...

5   48   48  

MCC5-THU-Gearbox-Benchmark-Datasets

A benchmark fault diagnosis dataset comprises vibration data collected...

3   48   48  

o1_medical

2   48   48  

WiMANS

[ECCV 2024] WiMANS: A Benchmark Dataset for WiFi-based Multi-user Acti...

8   48   48  

RCAEval

[ASE'24][WWW'25] RCAEval: A Benchmark for Root Cause Analysis. https:/...

8   48   48  

syntherela

A package for benchmarking synthetic relational data generation method...

1   48   48  

rust-zero-cost-abstractions

Testing out a Zero Cost Abstraction in Rust compared to similar approa...

2   48   48  

weather4cast

Code accompanying our IARAI Weather4cast Challenge

17   48   48  

mofdscribe

An ecosystem for digital reticular chemistry

9   48   48  

SeasonDepth

This package provides a python toolkit for the evaluation on the "Seas...

5   48   48  

PRUDEX-Compass

Official implementation of PRUDEX-Compass

21   47   47  

php-benchmark-script

A simple PHP script that helps you compare raw performance across serv...

10   47   47  

SWE-bench-Live

🚀 SWE-bench Goes Live!

3   47   47  

react-native-css-in-js-benchmarks

CSS in JS Benchmarks for React Native

11   47   47  

scRNAseq_cell_cluster_labeling

Scripts to run and benchmark scRNA-seq cell cluster labeling methods

14   47   47  

spatial_index_benchmark

Simple non-academic performance comparison of available open source im...

10   47   47  

hyperspectral-soilmoisture-dataset

Hyperspectral and soil-moisture data from a field campaign based on a...

13   47   47  

benchmarkify

⚡ Benchmark framework for NodeJS

3   47   47  

Unity-Pathfinding-Jobs-StressTest

Unity project showcasing A* pathfinding, fully jobified & burst compil...

10   47   47  

fizzboom

Benchmark to compare async web server + interpreter + web client imple...

7   46   46  

bots

Barcelona OpenMP Task Suite is a collection of applications that allow...

17   46   46  

heatwave-tpch

SQL scripts for HeatWave benchmarking

14   46   46  

arb

Advanced Reasoning Benchmark Dataset for LLMs

3   46   46  

ToMBench

ToMBench: Benchmarking Theory of Mind in Large Language Models, ACL 20...

4   46   46  

ReForm-Eval

A benchmark for evaluating the capabilities of large vision-language...

4   46   46  

stringzilla-benchmarks-rs

Comparing performance-oriented string-processing libraries for substri...

4   46   46  

spiko

🚀 Spiko is a fast, Rust-based load testing tool with a beautiful TUI...

4   45   45  

multimodal-needle-in-a-haystack

[NAACL 2025 Oral] Multimodal Needle in a Haystack (MMNeedle): Benchmar...

3   45   45  

Frame-Time-Analysis

Web application that charts and compares multiple frame time logs at t...

4   45   45  

python-package-manager-shootout

Benchmarking various Python package managers

12   45   45  

wordpress-speedtest

VPS Speedtest for WordPress with 160 results: 🏆 UpCloud (raw memory a...

14   45   45  

cloud-workbench

Cloud WorkBench (CWB) is a web-based framework that is grounded on the...

6   45   45  

node-red-contrib-actionflows

Provides a set of nodes to enable an extendable design pattern for flo...

12   45   45  

SQL-ProcBench

SQL-ProcBench is an open benchmark for procedural workloads in RDBMSs.

8   45   45  

KcBERT-Finetune

KcBERT/KcELECTRA Fine Tune Benchmarks code (forked from https://github...

10   45   45  

RediSearchBenchmark

Benchmarks for the RediSearch module

8   44   44  

dammmdatagen

Marketing Mix Modeling Data Generator

7   44   44  

mdl-stance-robustness

Multi-dataset stance detection and robustness experiments

11   44   44  

weblink

7   44   44  

conflictbank

Code and data for "ConflictBank: A Benchmark for Evaluating the Influe...

0   44   44  

Fuzzle

Fuzzle: Making a Puzzle for Fuzzers (ASE'22)

8   44   44  

LLM-Inference-Bench

LLM-Inference-Bench

4   44   44