🪐 A Database of Existing Security Vulnerabilities Patches to Enable Evaluation of Techniques (single-commit; multi-language)
Benchmark for Multi-Scenario-Recommendation.
Performance comparisons of bundlers and build tools, including Rspack, Rsbuild, webpack, Vite and Farm.
Benchmarking Vision-Language Models on OCR tasks in Dynamic Video Environments
A universal database query benchmark tool
⏱️ Reliable performance measurement for Go programs. All in one design.
A simple benchmark which measures latency between CPU cores.
Catalyst.RL: A Distributed Framework for Reproducible RL Research
JS validators benchmark
A benchmark suite for measuring HDF5 performance.
The Multitask Long Document Benchmark
A lightweight Python package for performance testing of Python functions.
Benchmarking FASTQ compression with 'mature' compression algorithms
[ACL 2025 🔥] A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding
Java filter benchmarks
Continual Reinforcement Learning in 3D Non-stationary Environments
Welcome to Snowman App – a Data Matching Benchmark Platform.
SQLite TPCH database
[TPAMI 2024] A Survey and Benchmark for Automatic Surface Reconstruction from Point Clouds
A Scandinavian Benchmark for sentence embeddings
GHOSTS dataset
VPS aggregate test script that directly outputs formatted Markdown, ready to paste
[NeurIPS'25] MLLM-CompBench evaluates the comparative reasoning of MLLMs with 40K image pairs and questions across 8 dimensions of relative comparison...
First-of-its-kind AI benchmark for evaluating the protection capabilities of large language model (LLM) guard systems (guardrails and safeguards)
ColdRec: An Open-Source Benchmark Toolbox for Cold-Start Recommendation.
Seeing from Another Perspective: Evaluating Multi-View Understanding in MLLMs
Compare simple REST server performance in Node.js and Go
Benchmarking the best way to store long, Object value pairs in a map.
String matching algorithm benchmark
NAS Benchmark in "Prioritized Architecture Sampling with Monto-Carlo Tree Search", CVPR2021
A challenge on the mapping of satellite altimeter sea surface height data organised by MEOM@IGE, Ocean-Next and CLS.
This repository contains my TensorTrade-focused code, including the core program and supplemental tools used in my bachelor's thesis on trading low ma...
Fair ML in credit scoring: Assessment, implementation and profit implications
Tools such as plots and metrics to analyze (simulated) reviews for ASReview LAB
🛸 NVMe / 🚀 SSD / 🖴 HDD S.M.A.R.T Monitoring. Site: https://diskcheck.monster
Benchmarking Rust storage engines
CSSegmentation: An Open Source Continual Semantic Segmentation Toolbox Based on PyTorch.
WfCommons: A Framework for Enabling Scientific Workflow Research and Development
Evaluation framework for testing segmentation networks in Keras
Evaluate your word embeddings
horoscope is an optimizer inspector for DBMS.
A couple of examples showing how pytest and its plugins can be combined to solve real-world needs.
Deep Learning Hard (DL-HARD) is a new annotated dataset extending TREC Deep Learning benchmark.
Embeddings: State-of-the-art Text Representations for Natural Language Processing tasks; the initial version of the library focuses on the Polish language
Cost/performance analysis of index structures on SSD and persistent memory (CIDR 2022)
Benchmarking pairwise aligners
🤗 AeroPath: An airway segmentation benchmark dataset with challenging pathology
PhyX: Does Your Model Have the "Wits" for Physical Reasoning?