Topic

benchmark

Repositories (1763)

ecs
ecs oneclickvirt Go

VPS Fusion Monster Server Test GO Version Aiming to be the most comprehensive server testing project, implemented in Go with zero environment dependen...

2k
Awesome-LLM-Long-Context-Modeling
Awesome-LLM-Long-Context-Modeling Xnhyacinth

📰 Must-read papers and blogs on LLM based Long Context Modeling 🔥

2k
logparser
logparser logpai Python

A machine learning toolkit for log parsing [ICSE'19, DSN'16]

2k
web-to-desktop-framework-comparison
web-to-desktop-framework-comparison Elanis JavaScript

An objective comparison of multiple frameworks that allow us to "transform" our web apps to desktop applications.

1.9k
1m-go-tcp-server
1m-go-tcp-server smallnest Go

Benchmarks for implementations of servers that support 1 million connections

1.9k
less_slow.cpp
less_slow.cpp ashvardanian C++

Playing around "Less Slow" coding practices in C++ 20, C, CUDA, PTX, & Assembly, from numerics & SIMD to coroutines, ranges, exception handling, netwo...

1.9k
tapnet
tapnet google-deepmind Jupyter Notebook

Tracking Any Point (TAP)

1.9k
fastRAG
fastRAG IntelLabs Python

Efficient Retrieval Augmentation and Generation Framework

1.8k
LIBERO
LIBERO Lifelong-Robot-Learning Jupyter Notebook

Benchmarking Knowledge Transfer in Lifelong Robot Learning

1.8k
training
training mlcommons Python

Reference implementations of MLPerf® training benchmarks

1.8k
evalplus
evalplus evalplus Python

Rigorous evaluation of LLM-synthesized code - NeurIPS 2023 & COLM 2024

1.7k
nanobench
nanobench martinus C++

Simple, fast, accurate single-header microbenchmarking functionality for C++11/14/17/20

1.7k
VBench
VBench Vchitect Python

[CVPR2024 Highlight] VBench - We Evaluate Video Generation

1.6k
LLM-eval-survey
LLM-eval-survey MLGroupJLU

The official GitHub page for the survey paper "A Survey on Evaluation of Large Language Models".

1.6k
inference
inference mlcommons Python

Reference implementations of MLPerf® inference benchmarks

1.6k
py-motmetrics
py-motmetrics cheind Python

📊 Benchmark multiple object trackers (MOT) in Python

1.5k
llm-colosseum
llm-colosseum OpenGenerativeAI Jupyter Notebook

Benchmark LLMs by fighting in Street Fighter 3! The new way to evaluate the quality of an LLM

1.5k
hotpath-rs
hotpath-rs pawurb Rust

Rust Performance Profiler & Channels Monitoring Toolkit (TUI, MCP)

1.4k
BEHAVIOR-1K
BEHAVIOR-1K StanfordVL Python

BEHAVIOR-1K: a platform for accelerating Embodied AI research. Join our Discord for support: https://discord.gg/bccR5vGFEx

1.4k
pytest-benchmark
pytest-benchmark ionelmc Python

pytest fixture for benchmarking code

1.4k
ut
ut boost-ext C++

C++20 μ(micro)/Unit Testing framework

1.4k
divan
divan nvzqz Rust

Fast and simple benchmarking for Rust projects

1.4k
FastExpressionCompiler
FastExpressionCompiler dadhi C#

Fast Compiler for C# Expression Trees and the lightweight LightExpression alternative. Diagnostic and code generation tools for the expressions.

1.4k
SLM-Lab
SLM-Lab kengz Python

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

1.3k
smac
smac oxwhirl Python

SMAC: The StarCraft Multi-Agent Challenge

1.3k
Awesome-System2-Reasoning-LLM
Awesome-System2-Reasoning-LLM zzli2022 Python

Latest Advances on System-2 Reasoning

1.3k
MedMNIST
MedMNIST MedMNIST Python

[pip install medmnist] 18x Standardized Datasets for 2D and 3D Biomedical Image Classification

1.3k
jsperf.com
jsperf.com jsperf JavaScript

jsperf.com v2. https://github.com/h5bp/lazyweb-requests/issues/174

1.3k
Attabench
Attabench attaswift Swift

Microbenchmarking app for Swift with nice log-log plots

1.3k
boomer
boomer myzhan Go

A better load generator for Locust, written in Go.

1.2k
github-action-benchmark
github-action-benchmark benchmark-action TypeScript

GitHub Action for continuous benchmarking to keep performance

1.2k
bench-scripts
bench-scripts haydenjames

A compilation of Linux server benchmarking scripts.

1.2k
flow
flow flow-project Python

Computational framework for reinforcement learning in traffic control

1.2k
appdocs
appdocs sjtuhjh

Application Performance Optimization Summary

1.2k
LongBench
LongBench THUDM Python

LongBench v2 and LongBench (ACL '25 & '24)

1.2k
TBCF
TBCF Escheee Objective-C

Tracking Benchmark for Correlation Filters

1.1k
PDEBench
PDEBench pdebench Python

PDEBench: An Extensive Benchmark for Scientific Machine Learning

1.1k
kg-gen
kg-gen stair-lab Python

[NeurIPS '25] Knowledge Graph Generation from Any Text

1.1k
omnisafe
omnisafe PKU-Alignment Python

JMLR: OmniSafe is an infrastructural framework for accelerating SafeRL research.

1.1k
OpenSTL
OpenSTL chengtan9907 Python

OpenSTL: A Comprehensive Benchmark of Spatio-Temporal Predictive Learning

1.1k
primesieve
primesieve kimwalisch C++

🚀 Fast prime number generator

1.1k
VectorDBBench
VectorDBBench zilliztech Python

Benchmark for vector databases.

1.1k
tau2-bench
tau2-bench sierra-research Python

τ-Bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

1.1k
lzbench
lzbench inikep C

lzbench is an in-memory benchmark of open-source compressors

1.1k
memtier_benchmark
memtier_benchmark redis C++

NoSQL Redis and Memcache traffic generation and benchmarking tool.

1k
benchmark
benchmark pytorch Python

TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.

1k
php-framework-benchmark
php-framework-benchmark kenjis PHP

PHP Framework Benchmark

1k
pyperformance
pyperformance python Python

Python Performance Benchmark Suite

1k
ADBench
ADBench Minqi824 Python

Official implementation of "ADBench: Anomaly Detection Benchmark", NeurIPS 2022.

1k
asv
asv airspeed-velocity Python

Airspeed Velocity: A simple Python benchmarking tool with web-based reporting

998