Most popular benchmark repositories and open source projects

ecs oneclickvirt Go

VPS Fusion Monster Server Test, Go version. Aiming to be the most comprehensive server testing project, implemented in Go with zero environment dependen...

2k 130 2k

Awesome-LLM-Long-Context-Modeling Xnhyacinth

📰 Must-read papers and blogs on LLM-based Long Context Modeling 🔥

2k 83 2k

logparser logpai Python

A machine learning toolkit for log parsing [ICSE'19, DSN'16]

2k 581 2k

web-to-desktop-framework-comparison Elanis JavaScript

An objective comparison of multiple frameworks that allow us to "transform" our web apps to desktop applications.

1.9k 53 1.9k

1m-go-tcp-server smallnest Go

Benchmarks for implementations of servers that support 1 million connections

1.9k 355 1.9k

less_slow.cpp ashvardanian C++

Playing around with "Less Slow" coding practices in C++20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, netwo...

1.9k 81 1.9k

tapnet google-deepmind Jupyter Notebook

Tracking Any Point (TAP)

1.9k 176 1.9k

fastRAG IntelLabs Python

Efficient Retrieval Augmentation and Generation Framework

1.8k 167 1.8k

LIBERO Lifelong-Robot-Learning Jupyter Notebook

Benchmarking Knowledge Transfer in Lifelong Robot Learning

1.8k 400 1.8k

training mlcommons Python

Reference implementations of MLPerf® training benchmarks

1.8k 587 1.8k

evalplus evalplus Python

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024

1.7k 194 1.7k

nanobench martinus C++

Simple, fast, accurate single-header microbenchmarking functionality for C++11/14/17/20

1.7k 101 1.7k

VBench Vchitect Python

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

1.6k 110 1.6k

LLM-eval-survey MLGroupJLU

The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".

1.6k 100 1.6k

inference mlcommons Python

Reference implementations of MLPerf® inference benchmarks

1.6k 620 1.6k

py-motmetrics cheind Python

📊 Benchmark multiple object trackers (MOT) in Python

1.5k 262 1.5k

llm-colosseum OpenGenerativeAI Jupyter Notebook

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM

1.5k 179 1.5k

hotpath-rs pawurb Rust

Rust Performance Profiler & Channels Monitoring Toolkit (TUI, MCP)

1.4k 37 1.4k

BEHAVIOR-1K StanfordVL Python

BEHAVIOR-1K: a platform for accelerating Embodied AI research. Join our Discord for support: https://discord.gg/bccR5vGFEx

1.4k 185 1.4k

pytest-benchmark ionelmc Python

pytest fixture for benchmarking code

1.4k 132 1.4k

ut boost-ext C++

C++20 μ(micro)/Unit Testing framework

1.4k 133 1.4k

divan nvzqz Rust

Fast and simple benchmarking for Rust projects

1.4k 39 1.4k

FastExpressionCompiler dadhi C#

Fast Compiler for C# Expression Trees and the lightweight LightExpression alternative. Diagnostic and code generation tools for the expressions.

1.4k 94 1.4k

SLM-Lab kengz Python

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

1.3k 288 1.3k

smac oxwhirl Python

SMAC: The StarCraft Multi-Agent Challenge

1.3k 238 1.3k

Awesome-System2-Reasoning-LLM zzli2022 Python

Latest Advances on System-2 Reasoning

1.3k 79 1.3k

MedMNIST MedMNIST Python

[pip install medmnist] 18x Standardized Datasets for 2D and 3D Biomedical Image Classification

1.3k 207 1.3k

jsperf.com jsperf JavaScript

jsperf.com v2. https://github.com/h5bp/lazyweb-requests/issues/174

1.3k 126 1.3k

Attabench attaswift Swift

Microbenchmarking app for Swift with nice log-log plots

1.3k 45 1.3k

boomer myzhan Go

A better load generator for locust, written in golang.

1.2k 245 1.2k

github-action-benchmark benchmark-action TypeScript

GitHub Action for continuous benchmarking to keep performance

1.2k 181 1.2k

bench-scripts haydenjames

A compilation of Linux server benchmarking scripts.

1.2k 170 1.2k

flow flow-project Python

Computational framework for reinforcement learning in traffic control

1.2k 393 1.2k

appdocs sjtuhjh

Application Performance Optimization Summary

1.2k 466 1.2k

LongBench THUDM Python

LongBench v2 and LongBench (ACL '25 & '24)

1.2k 127 1.2k

TBCF Escheee Objective-C

Tracking Benchmark for Correlation Filters

1.1k 325 1.1k

PDEBench pdebench Python

PDEBench: An Extensive Benchmark for Scientific Machine Learning

1.1k 145 1.1k

kg-gen stair-lab Python

[NeurIPS '25] Knowledge Graph Generation from Any Text

1.1k 165 1.1k

omnisafe PKU-Alignment Python

JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.

1.1k 154 1.1k

OpenSTL chengtan9907 Python

OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning

1.1k 187 1.1k

primesieve kimwalisch C++

🚀 Fast prime number generator

1.1k 133 1.1k

VectorDBBench zilliztech Python

Benchmark for vector databases.

1.1k 366 1.1k

tau2-bench sierra-research Python

τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

1.1k 273 1.1k

lzbench inikep C

lzbench is an in-memory benchmark of open-source compressors

1.1k 205 1.1k

memtier_benchmark redis C++

NoSQL Redis and Memcache traffic generation and benchmarking tool.

1k 242 1k

benchmark pytorch Python

TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.

1k 333 1k

php-framework-benchmark kenjis PHP

PHP Framework Benchmark

1k 160 1k

pyperformance python Python

Python Performance Benchmark Suite

1k 202 1k

ADBench Minqi824 Python

Official implementation of "ADBench: Anomaly Detection Benchmark", NeurIPS 2022.

1k 151 1k

asv airspeed-velocity Python

Airspeed Velocity: A simple Python benchmarking tool with web-based reporting

998 204 998
