Most popular benchmark repositories and open source projects

dnstrace redsift Go

Command-line DNS benchmark

82 9 82

ruby-performance-tools JuanitoFatas

List of Ruby Tools for doing Performance.

82 5 82

MedAgentBench stanfordmlgroup Python

MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents

82 17 82

logs-benchmark SigNoz Shell

Logs performance benchmark repo: Comparing Elastic, Loki and SigNoz

82 3 82

LearnedSort anikristo C++

Learned Sort: a model-enhanced sorting algorithm

81 12 81

php-orm-benchmark sergeyklay PHP

The benchmark to compare performance of PHP ORM solutions.

81 6 81

llm-benchmark lework Python

LLM concurrency performance testing tool, with support for automated stress testing and performance report generation.

81 14 81

tasty-bench Bodigrim Haskell

Featherlight benchmark framework, drop-in replacement for criterion and gauge.

81 13 81

WritingBench X-PLUG Python

WritingBench: A Comprehensive Benchmark for Generative Writing

81 9 81

vllm-safety-benchmark UCSC-VLAA Python

[ECCV 2024] Official PyTorch Implementation of "How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs"

81 4 81

perforator zyedidia Go

Record "perf" performance metrics for individual functions/regions of an ELF binary.

81 5 81

benchinit mvdan Go

Benchmark the init cost of Go packages

81 3 81

http-benchmark-tornado junneyang Python

A high-performance HTTP performance testing tool based on Python Tornado. Java Netty version: https://github.com/junneyang/http-benchmark-netty

81 48 81

gobench gobench-io HTML

A benchmark framework based on Golang

80 15 80

indonlg IndoNLP Python

The first-ever vast natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained I...

80 14 80

MDBenchmark bio-phys Python

Quickly generate, start and analyze benchmarks for molecular dynamics simulations.

80 17 80

WildScenes csiro-robotics Python

[IJRR2024] The official repository for the WildScenes: A Benchmark for 2D and 3D Semantic Segmentation in Natural Environments

80 4 80

ASR_benchmark Franck-Dernoncourt Python

Program to benchmark various speech recognition APIs

80 18 80

vector-db-benchmark myscale Python

Framework for benchmarking fully-managed vector databases

79 19 79

sugar-crepe RAIVNLab Python

[NeurIPS 2023] A faithful benchmark for vision-language compositionality

79 9 79

DotNet-Collections-Benchmark mjebrahimi C#

🚀 A comprehensive performance comparison benchmark between different .NET collections.

79 7 79

router-benchmark delvedor JavaScript

Benchmark of the most commonly used http routers

79 17 79

gocannon kffl Go

💥 Performance-focused HTTP load testing tool written in Go

78 8 78

contender flashbots Rust

Run highly configurable benchmarks for EVM-based execution nodes over JSON-RPC

78 27 78

Uniaa KwaiVGI Python

Unified Multi-modal IAA Baseline and Benchmark

78 5 78

Mercury Elfsong Jupyter Notebook

Code Efficiency Benchmark

78 9 78

PointCloudMatters HaoyiZhu Python

[NeurIPS 2024 D&B] Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning

78 3 78

OpenRCA microsoft Python

[ICLR'25] OpenRCA: Can Large Language Models Locate the Root Cause of Software Failures?

77 7 77

php-di-container-benchmarks kocsismate PHP

Benchmark for some popular PHP Dependency Injection Containers.

77 26 77

arewefastyet vitessio Go

Automated Benchmarking System for Vitess

76 57 76

RWKU jinzhuoran Python

RWKU: Benchmarking Real-World Knowledge Unlearning for Large Language Models. NeurIPS 2024

75 6 75

sightglass bytecodealliance C

A benchmark suite and tool to compare different implementations of the same primitives.

75 36 75

benchable MatheusRich Ruby

Write benchmarks without the hassle.

75 1 75

Elysium Hon-Wong Python

[ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM

74 4 74

lhbench lhbench Scala

Lakehouse storage system benchmark

74 10 74

TLCBench tlc-pack Python

Benchmark scripts for TVM

74 28 74

ipc_benchmark detailyang Python

IPC benchmark on Linux

74 55 74

apebench tum-pbs Python

[NeurIPS 2024] A benchmark suite for autoregressive neural emulation of PDEs. (≥46 PDEs in 1D, 2D, 3D; Differentiable Physics; Unrolled Training; Roll...

73 1 73