Topic

benchmark

Repositories (1623)

leaderboard
leaderboard KGQA Jupyter Notebook

You can find the most recent KGQA benchmark numbers from publications here.

128
TSB-AD
TSB-AD TheDatumOrg Python

TSB-AD: Towards A Reliable Time-Series Anomaly Detection Benchmark

127
V2X-Sim
V2X-Sim ai4ce

[RA-L2022] V2X-Sim Dataset and Benchmark

127
factory_bot_instruments
factory_bot_instruments shiroyasha Ruby

Instruments for benchmarking, tracing, and debugging Factory Girl models.

126
PerformanceBenchmarkReporter
PerformanceBenchmarkReporter Unity-Technologies JavaScript

The Unity Performance Benchmark tool enables partners and developers to establish benchmark samples and measurements using the Performance Testing pac...

126
pddl-instances
pddl-instances potassco Common Lisp

🌍 PDDL instances covering the International Planning Competitions

126
optimum-transformers
optimum-transformers AlekseyKorshuk Python

Accelerated NLP pipelines for fast inference on CPU and GPU. Built with Transformers, Optimum and ONNX Runtime.

126
quantum-benchmarks
quantum-benchmarks yardstiq OpenQASM

benchmarking quantum circuit emulators for your daily research usage

125
facies_classification_benchmark
facies_classification_benchmark yalaudah Python

The repository includes PyTorch code, and the data, to reproduce the results for our paper titled "A Machine Learning Benchmark for Facies Classificat...

125
Video-Bench
Video-Bench PKU-YuanGroup Python

A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models!

125
LLM-Agent-Benchmark-List
LLM-Agent-Benchmark-List zhangxjohn

A benchmark list for evaluation of large language models.

125
VCSL
VCSL alipay Python

Video Copy Segment Localization (VCSL) dataset and benchmark [CVPR2022]

125
Deep_GCN_Benchmarking
Deep_GCN_Benchmarking VITA-Group Python

[TPAMI 2022] "Bag of Tricks for Training Deeper Graph Neural Networks: A Comprehensive Benchmark Study" by Tianlong Chen*, Kaixiong Zhou*, Keyu Duan, W...

125
volley
volley jonhoo C

Volley is a benchmarking tool for measuring the performance of server networking stacks.

124
meta-blocks
meta-blocks alshedivat Python

A modular toolbox for meta-learning research with a focus on speed and reproducibility.

124
Audit-Test-Automation
Audit-Test-Automation fbprogmbh PowerShell

FBPro Audit Test Automation Package allows you to create compliance reports for your systems. The resulting HTML-reports provide a transparent overvie...

124
Python_Portfolio__VaR_Tool
Python_Portfolio__VaR_Tool MBKraus Python

Python-based portfolio / stock widget which sources data from Yahoo Finance and calculates different types of Value-at-Risk (VaR) metrics and many oth...

123
jvm-performance-benchmarks
jvm-performance-benchmarks ionutbalosin Java

Java Virtual Machine (JVM) Performance Benchmarks with a primary focus on top-tier Just-In-Time (JIT) Compilers, such as C2 JIT, Graal JIT, and the Fa...

123
MVP_Benchmark
MVP_Benchmark paul007pl Python

MVP Benchmark for Multi-View Partial Point Cloud Completion and Registration

123
SciCode
SciCode scicode-bench Python

A benchmark that challenges language models to code solutions for scientific problems

123
core50
core50 vlomonaco Python

CORe50: a new Dataset and Benchmark for Continual Learning

122
clinical-trial-outcome-prediction
clinical-trial-outcome-prediction futianfan Python

Benchmark dataset and deep learning method (Hierarchical Interaction Network, HINT) for clinical trial approval probability prediction, published in C...

122
Co-learning
Co-learning chengtan9907 Python

The official implementation of the ACM MM'21 paper Co-learning: Learning from noisy labels with self-supervision.

121
tape-neurips2019
tape-neurips2019 songlab-cal Python

Tasks Assessing Protein Embeddings (TAPE), a set of five biologically relevant semi-supervised learning tasks spread across different domains of prote...

120
csgo-benchmark
csgo-benchmark samisalreadytaken Squirrel

Benchmark CS:GO on any map

120
meter
meter OleksandrKucherenko Java

Meter is a simple micro-benchmarking tool for Android (and Java) projects. This is not a profiler; it is a very small utility class that is designed fo...

119
cult
cult asmjit C++

CPU Ultimate Latency Test.

119
Touchstone
Touchstone MrGiovanni Jupyter Notebook

[NeurIPS 2024] Touchstone - Benchmarking AI on 5,172 o.o.d. CT volumes and 9 anatomical structures

118
node-frameworks-benchmark
node-frameworks-benchmark hbakhtiyor JavaScript

Simple HTTP benchmark for different nodejs frameworks using wrk

117
MM-NIAH
MM-NIAH OpenGVLab Python

[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLL...

117
react-native-startup-time
react-native-startup-time doomsower Java

measure startup time of your react-native app

117
MCPBench
MCPBench modelscope Python

The evaluation benchmark on MCP servers

117
aurora
aurora rese1f Python

[ICLR 2025] AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark

117
WorfBench
WorfBench zjunlp Python

[ICLR 2025] Benchmarking Agentic Workflow Generation

117
MMLU-CF
MMLU-CF microsoft

A Contamination-free Multi-task Language Understanding Benchmark [Official, ACL 2025]

117
benchmark
benchmark TheDragonCode PHP

Simple comparison of code execution speed between different options

116
caliper-benchmarks
caliper-benchmarks hyperledger-caliper JavaScript

Sample benchmark files for Hyperledger Caliper https://wiki.hyperledger.org/display/caliper

116
dana
dana google JavaScript

Test/benchmark regression and comparison system with dashboard

116
CharXiv
CharXiv princeton-nlp Python

[NeurIPS 2024] CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs

115
pint-benchmark
pint-benchmark lakeraai Jupyter Notebook

A benchmark for prompt injection detection systems.

115
MTAD
MTAD OpsPAI Python

MTAD: Tools and Benchmark for Multivariate Time Series Anomaly Detection

115
rvv-bench
rvv-bench camel-cdr Assembly

A collection of RISC-V Vector (RVV) benchmarks to help developers write portably performant RVV code

115
less_slow.py
less_slow.py ashvardanian Python

Playing around "Less Slow" coding practices in Python, from numerical micro-kernels to coroutines, ranges, and polymorphic state machines

114
react-benchmark
react-benchmark Rowno JavaScript

A tool for benchmarking the render performance of React components

113
SpeedTests
SpeedTests jabbalaci Python

comparing the execution speeds of various programming languages

113
less_slow.rs
less_slow.rs ashvardanian Rust

Playing around "Less Slow" coding practices in Rust, from numerical micro-kernels to coroutines, ranges, and polymorphic state machines

112
benchyou
benchyou xelabs Go

benchyou is a benchmark tool for MySQL, real-time monitoring TPS and vmstat/iostat

111
rust-web-frameworks-benchmark
rust-web-frameworks-benchmark rousan Rust

A hello world benchmark for the available Rust Web Frameworks: hyper vs gotham vs actix-web vs warp vs rocket

111
local-planning-benchmark
local-planning-benchmark NKU-MobFly-Robotics C++

[ICRA2021] A unified benchmark for the evaluation of mobile robot local planning approaches

111
Shuhai
Shuhai RC4ML SystemVerilog

Shuhai is a benchmarking-memory tool that allows FPGA programmers to demystify all the underlying details of memories, e.g., HBM and DDR4, on a Xilinx...

110