This repository contains the code base for the Open Stream Processing Benchmark.
A simple benchmark testing tool implemented in Go, with a few small features
[CVPR 2025] Science-T2I: Addressing Scientific Illusions in Image Synthesis
Unity Netcode/Network Benchmark Comparison. Fusion, Fishnet, Mirror, Mirage, Netick, NGO
DSBench: How Far are Data Science Agents from Becoming Data Science Experts?
Easy Benchmarking with PowerShell
Open Source AI Benchmarking toolkit for benchmarking speech to text services
Text style transfer benchmark
A simple, lightweight NodeJS utility for measuring code execution time in high resolution.
Testing different approaches to improve PHP script performance
The NAS Parallel Benchmarks for evaluating C++ parallel programming frameworks on shared-memory architectures
Human Benchmark is a Flutter app for Android with many tests of your abilities.
Benchmark evaluation code for "SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal" (ICLR 2025)
LeakDB (Leakage Diagnosis Benchmark) is a realistic leakage dataset for water distribution networks. The dataset is comprised of a large number of art...
Repository for the Objects With Lighting Dataset
CVE-Bench: A Benchmark for AI Agents’ Ability to Exploit Real-World Web Application Vulnerabilities
TLS handshake benchmarking tool
Attention-based View Selection Networks for Light-field Disparity Estimation
Show how to perform fast retraining with LightGBM in different business cases
Benchmark.js results in ASCII tables for NodeJS
Unmaintained, prefer these BenchmarkSQL forks: https://github.com/wieck/benchmarksql and https://github.com/pgsql-io/benchmarksql
Companion code for FanOutQA: Multi-Hop, Multi-Document Question Answering for Large Language Models (ACL 2024)
🌈 Visualizes your BenchmarkDotNet benchmarks as colorful images and feature-rich HTML (and maybe powerful charts in the future!)
SUES-200: A Multi-height Multi-scene Cross-view Image Benchmark Across Drone and Satellite
A Benchmark Suite for Heterogeneous System Computation
Tests and benchmarks for cuDNN (and in the future, other NVIDIA libraries)
Audio performance benchmark for jitter, theoretical latency, etc.
Realtime benchmarks for PHP code
Measuring the performance of popular streaming engines with Yahoo's Streaming Benchmark
An ecosystem for digital reticular chemistry
Evaluation of Line Detection and Association
Video dataset dedicated to portrait-mode video recognition.
This is the issue repository for a TypeScript framework meant to performance-test anything even remotely REST-like, and for related tools
An inquiry into nondogmatic software development. An experiment showing double the performance of code running on the JVM compared to equivalent native C...
Distributed S3 benchmarking tool - Replacement of Cosbench
[ICLR 2025] Robust Gymnasium: A Unified Modular Benchmark for Robust Reinforcement Learning.
A benchmark dataset collection for bird sound classification
Dataset and benchmark for assessing LLMs in translating natural language descriptions of planning problems into PDDL
Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLMs to engage i...
Collection of tips for faster spatial data processing in R
🔥Performance Wars Benchmarking C# - This repository contains a collection of C# benchmarks to compare the performance of different approaches to solv...
⚡ A collection of common functions with better performance, fewer allocations, and fewer dependencies, created for Fiber.
Benchmark your local LLMs.
Aix-bench, a Java benchmark for the code synthesis problem.
Mutual information estimators and benchmark
Code and data of the EMNLP 2022 paper "Why Should Adversarial Perturbations be Imperceptible? Rethink the Research Paradigm in Adversarial NLP".
Benchmark for evaluating open-ended generation
Benchmarking tool to stress real-time protocols