Most popular benchmark repositories and open source projects

build-tools-performance

Performance comparisons of bundlers and build tools, including Rspack,...

1   40   40  

ocr-benchmark

Benchmarking Vision-Language Models on OCR tasks in Dynamic Video Envi...

3   40   40  

DatabaseBenchmark

A universal database query benchmark tool

3   40   40  

compression_benchmark

Benchmarking FASTQ compression with 'mature' compression algorithms

4   39   39  

KITAB-Bench

[ACL 2025 🔥] A Comprehensive Multi-Domain Benchmark for Arabic OCR an...

2   39   39  

h5bench

A benchmark suite for measuring HDF5 performance.

29   39   39  

muld

The Multitask Long Document Benchmark

1   39   39  

perftester

A lightweight Python package for performance testing of Python functio...

0   39   39  

bench

⏱️ Reliable performance measurement for Go programs. All in one design...

5   39   39  

core-latency

A simple benchmark which measures latency between CPU cores.

5   39   39  

catalyst-rl-framework

Catalyst.RL: A Distributed Framework for Reproducible RL Research

3   39   39  

validator-benchmark

JS validators benchmark

19   39   39  

ExecutorBenchmark

18   38   38  

javafilters-benchmarks

Java filter benchmarks

5   38   38  

crlmaze

Continual Reinforcement Learning in 3D Non-stationary Environments

4   38   38  

snowman

Welcome to Snowman App – a Data Matching Benchmark Platform.

2   38   38  

TPCH-sqlite

SQLite TPCH database

14   38   38  

scandinavian-embedding-benchmark

A Scandinavian Benchmark for sentence embeddings

5   38   38  

language_performance_prime_algorithm

Language speed test that runs an is_prime function

19   38   38  

GHOSTS

GHOSTS dataset

6   38   38  

NodeBench

VPS aggregate test script that directly outputs well-formatted Markdown for easy pasting

5   38   38  

MLLM-CompBench

[NeurIPS'25] MLLM-CompBench evaluates the comparative reasoning of MLL...

2   38   38  

circle-guard-bench

First-of-its-kind AI benchmark for evaluating the protection capabilit...

2   38   38  

ColdRec

ColdRec: An Open-Source Benchmark Toolbox for Cold-Start Recommendatio...

6   38   38  

All-Angles-Bench

Seeing from Another Perspective: Evaluating Multi-View Understanding i...

3   38   38  

dsr-benchmark

[TPAMI 2024] A Survey and Benchmark for Automatic Surface Reconstructi...

4   38   38  

Fair_Credit_Scoring

Fair ML in credit scoring: Assessment, implementation and profit impli...

18   37   37  

asreview-insights

Tools such as plots and metrics to analyze (simulated) reviews for ASR...

13   37   37  

NVMe-SSD-HDD-S.M.A.R.T-Monitoring

🛸 NVMe / 🚀 SSD / 🖴 HDD S.M.A.R.T Monitoring. Site: https://diskcheck...

0   37   37  

rest-bench

Compare simple REST server performance in Node.js and Go

8   37   37  

Long-Map-Benchmarks

Benchmarking the best way to store long, Object value pairs in a map.

2   37   37  

stringbench

String matching algorithm benchmark

6   37   37  

NAS-Bench-Macro

NAS Benchmark in "Prioritized Architecture Sampling with Monto-Carlo T...

8   37   37  

2020a_SSH_mapping_NATL60

A challenge on the mapping of satellite altimeter sea surface height d...

10   37   37  

tensortrade

This repository contains my TensorTrade-focused code, including the co...

12   37   37  

rust-storage-bench

Benchmarking Rust storage engines

6   37   37  

cssegmentation

CSSegmentation: An Open Source Continual Semantic Segmentation Toolbox...

4   37   37  

AeroPath

🤗 AeroPath: An airway segmentation benchmark dataset with challen...

7   36   36  

PhyX

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?

0   36   36  

pgg_bench

Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agen...

2   36   36  

embeddings

Embeddings: State-of-the-art Text Representations for Natural Language...

3   36   36  

ssd-vs-pm

Cost/performance analysis of index structures on SSD and persistent me...

1   36   36  

pa-bench

Benchmarking pairwise aligners

1   36   36  

segmentation-networks-benchmark

Evaluation framework for testing segmentation networks in Keras

6   36   36  

embedding_evaluation

Evaluate your word embeddings

11   36   36  

horoscope

horoscope is an optimizer inspector for DBMS.

10   36   36  

pytest-patterns

A couple of examples showing how pytest and its plugins can be combine...

4   36   36  

DL-Hard

Deep Learning Hard (DL-HARD) is a new annotated dataset extending TREC...

4   36   36  

worker-threads-NodeJS

Benchmark Node.js worker threads for calculating prime numbers, using v...

5   35   35  

Edge-Detection-project

Tiny Image in JavaScript - Edge Detection Algorithms

9   35   35