Distributed database benchmark tester
CLI tool for a high QPS DNS benchmark
TurboRLE-Fastest Run Length Encoding
JMH benchmark of Java object-to-object mapping frameworks
Awesome diffusion Video-to-Video (V2V). A collection of papers on diffusion model-based video editing, a.k.a. video-to-video (V2V) translation. And a vid...
Benchmarking physical understanding in generative video models
Pantheon of Congestion Control
Automated STIG Benchmark Compliance Remediation for RHEL 7 with Ansible
BenchExec: A Framework for Reliable Benchmarking and Resource Measurement
🔥 Synthetic and real-world 2D/3D dataset for semantic and instance segmentation (BMVC 2022 Oral)
TensorFlow Metal Backend on Apple Silicon Experiments (just for fun)
Comparing the performance of different template engines
Web-Bench is a benchmark designed to evaluate the performance of LLMs in actual Web development.
Dataset and code for the paper "First-Person Hand Action Benchmark with RGB-D Videos and 3D Hand Pose Annotations", CVPR 2018.
Benchmarking framework for protein representation learning. Includes a large number of pre-training and downstream task datasets, models and training/...
A large-scale benchmark for machine learning methods in fluid dynamics
PostgreSQL Benchmarking Toolkit
A super simple tool to benchmark GraphQL queries
State-of-the-art methods on monocular 3D pose estimation / 3D mesh recovery
[ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation".
🔥[NeurIPS'25] DeepFund: Pilot for Your Next Fund Investment
Official implementation for WorldScore: A Unified Evaluation Benchmark for World Generation
A distributed storage benchmark for file systems, object stores & block devices with support for GPUs
Ultra-fast websocket client and server for asyncio
A micro Vulkan compute pipeline and a collection of benchmarking compute shaders
VideoGen-Eval: Agent-based System for Video Generation Evaluation
MedAgentBench: A Realistic Virtual EHR Environment to Benchmark Medical LLM Agents
The definitive benchmark for AI agents on OpenClaw. 45 tasks across 4 tiers. Powered by MyClaw.ai
[TPAMI 2026] Large-Scale 3D Medical Image Pre-training with Geometric Context Priors
App Servers benchmarked for: Ruby, Python, JavaScript, Dart, Elixir, Java, Crystal, Nim, Go, Rust
IMDBench — Realistic ORM benchmarking
The official evaluation suite and dynamic data release for MixEval.
Foundation model benchmarking tool. Run any model on any AWS platform and benchmark for performance across instance type and serving stack options.
Benchmark Kubernetes persistent disk volumes with fio: Read/write IOPS, bandwidth MB/s and latency
⏱️ single header benchmark framework for C and C++
Time-Series Anomaly Detection | Algorithms + Datasets + Tutorials
Blue Team Scripts
OpenXAI: Towards a Transparent Evaluation of Model Explanations
Automated Chrome tracing for benchmarking.
Official code for CVPR 2022 (Oral) paper "Deep Visual Geo-localization Benchmark"
High-precision, one-shot and consistent benchmarking framework/harness for Rust. All Valgrind tools at your fingertips.
Automated CIS Benchmark Compliance Remediation for Ubuntu 22 with Ansible
LexGLUE: A Benchmark Dataset for Legal Language Understanding in English
LLM concurrency performance testing tool with support for automated stress testing and performance report generation.
🔥🔥MLVU: Multi-task Long Video Understanding Benchmark
Benchmark-style memory profiling for Ruby 2.1+
Benchmarking Agentic LLM and VLM Reasoning On Games
YACCLAB: Yet Another Connected Components Labeling Benchmark
Benchmarking repo for secrets scanning