Topic

benchmark

Repositories (1623)

nyt-connections
nyt-connections lechmazur Python

Benchmark that evaluates LLMs using 651 NYT Connections puzzles extended with extra trick words

93
PAD
PAD EricLee0224 Python

[NeurIPS 2023] A Dataset and Benchmark for Pose-agnostic Anomaly Detection.

93
GLUE-X
GLUE-X YangLinyi Python

We leverage 14 datasets as OOD test data and conduct evaluations on 8 NLU tasks over 21 popularly used models. Our findings confirm that the OOD accur...

93
WebAssembly-benchmark
WebAssembly-benchmark takahirox HTML
93
BIRL
BIRL Borda Python

BIRL: Benchmark on Image Registration methods with Landmark validations

93
LaBench
LaBench microsoft Go

Latency Benchmarking tool

93
cropandweed-dataset
cropandweed-dataset cropandweed Python

[WACV 2023] Information and scripts for the CropAndWeed Dataset

92
vault-benchmark
vault-benchmark hashicorp Go

A tool for benchmarking usage of Vault.

92
holisticai
holisticai holistic-ai Jupyter Notebook

This is an open-source tool to assess and improve the trustworthiness of AI systems.

92
cec2017-py
cec2017-py tilleyd Python

Python module for CEC 2017 single objective optimization test function suite.

92
ddio-bench
ddio-bench aliireza Makefile

Reexamining Direct Cache Access to Optimize I/O Intensive Applications for Multi-hundred-gigabit Networks

92
shellbench
shellbench shellspec Shell

A benchmark utility for POSIX shell comparison

92
HArray
HArray Bazist C++

Fastest Trie structure (Linux & Windows)

92
karma-benchmark
karma-benchmark JamieMason TypeScript

A Karma plugin to run Benchmark.js over multiple browsers with CI compatible output.

91
Domain-generalization-fault-diagnosis-benchmark
Domain-generalization-fault-diagnosis-benchmark CHAOZHAO-1 Python

This is a benchmark for domain generalization-based fault diagnosis (code related to domain generalization)

90
umesimd
umesimd edanor C++

UME::SIMD: A library for explicit SIMD vectorization.

90
chembench
chembench lamalab-org Python

How good are LLMs at chemistry?

89
php-benchmarks
php-benchmarks EFTEC PHP

A collection of PHP benchmarks

89
drone-tracking
drone-tracking flyers MATLAB

DTB70 -- A Drone Tracking Benchmark

89
serverreview-benchmark
serverreview-benchmark sayem314 Shell

Serverreview Benchmark Script v3

89
rnn_benchmarks
rnn_benchmarks stefbraun Python

RNN benchmarks of PyTorch, TensorFlow, and Theano

89
javascript-serialization-benchmark
javascript-serialization-benchmark Adelost TypeScript

Comparison and benchmark of JavaScript serialization libraries (Protocol Buffer, Avro, BSON, etc.)

88
pibench
pibench sfu-dis C++

Benchmarking framework for index structures on persistent memory

88
Microbenchmarks.jl
Microbenchmarks.jl JuliaLang Jupyter Notebook

Microbenchmarks comparing the Julia Programming language with other languages

88
krun
krun softdevteam Python

High fidelity benchmark runner

88
ansibench
ansibench nfinit C

A selection of ANSI C benchmarks and programs useful as benchmarks

88
NetworkBenchmarkDotNet
NetworkBenchmarkDotNet JohannesDeml C#

Low-level dotnet network benchmark for UDP socket performance (.NET and Unity compatible)

87
FactCHD
FactCHD zjunlp Python

[IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection

87
Pano3D
Pano3D VCL3D Python

Code and models for "Pano3D: A Holistic Benchmark and a Solid Baseline for 360 Depth Estimation", OmniCV Workshop @ CVPR21.

87
Wild-Places
Wild-Places csiro-robotics Python

🏞️ [IEEE ICRA2023] The official repository for paper "Wild-Places: A Large-Scale Dataset for Lidar Place Recognition in Unstructured Natural Environm...

87
Benchmark
Benchmark WorldDownTown Swift

The Benchmark⏲ module provides methods to measure and report the time used to execute Swift code.

86
Awesome-Large-Recommendation-Models
Awesome-Large-Recommendation-Models USTC-StarTeam

🔥🔥🔥 Latest Advances on Large Recommendation Models

86
LOVA3
LOVA3 showlab Python

(NeurIPS 2024) Official PyTorch implementation of LOVA3

86
enterprise-h2ogpte
enterprise-h2ogpte h2oai Python

Client Code Examples, Use Cases and Benchmarks for Enterprise h2oGPTe RAG-Based GenAI Platform

86
mujoco-benchmark
mujoco-benchmark ChenDRAG

Provides a full reinforcement learning benchmark on MuJoCo environments, including DDPG, SAC, TD3, PG, A2C, PPO, library

85
BenchLMM
BenchLMM AIFEG Python

[ECCV 2024] BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models

85
GL_vs_VK
GL_vs_VK RippeR37 C++

Comparison of OpenGL and Vulkan API in terms of performance.

85
turkish-lm-tuner
turkish-lm-tuner boun-tabi-LMG Python

Turkish LM Tuner

85
gem_bench
gem_bench pboling Ruby

Benchmark different versions of same or similar gems & Static Gemfile and installed gem library source code analysis

85
pepe
pepe omarmhaimdat Rust

HTTP Load Generator

85
fast
fast adhocore Go

Check your internet speed/bandwidth right from your terminal. Built on Golang using chromedp

85
ncnn-android-benchmark
ncnn-android-benchmark nihui C++

ncnn android benchmark app

85
rb
rb klingtnet Rust

A thread-safe fixed-size circular buffer written in safe Rust.

85
wikidata-simplequestions
wikidata-simplequestions askplatypus Jupyter Notebook

Mapping of the SimpleQuestions dataset to Wikidata

85
tum-traffic-dataset-dev-kit
tum-traffic-dataset-dev-kit tum-traffic-dataset Python

TUM Traffic Dataset Development Kit

84
locust4j
locust4j myzhan Java

Locust4j is a load generator for locust, written in Java.

84
cuda_scheduling_examiner_mirror
cuda_scheduling_examiner_mirror yalue Cuda

A tool for examining GPU scheduling behavior.

84
ElegantMustard
ElegantMustard lscambo13

An elegant RTSS Overlay to showcase your benchmark stats in style.

83
pytest-django-queries
pytest-django-queries NyanKiyoshi Python

Generate performance reports from your django database performance tests.

83
webassembly-benchmarks
webassembly-benchmarks jedisct1

Libsodium WebAssembly benchmark results.

83