Topic

benchmark

Repositories (1623)

pgg_bench lechmazur

Public Goods Game (PGG) Benchmark: Contribute & Punish is a multi-agent benchmark that tests cooperative and self-interested strategies among Large La...

36
worker-threads-NodeJS nilshah98 JavaScript

Benchmark Node.js worker threads for calculating prime numbers, using various data structures

35
Edge-Detection-project bockp JavaScript

Tiny Image in JavaScript - Edge Detection Algorithms

35
cpp-serialization-benchmark felixguendling C++

Comparison of C++ Serialization Libraries for Graph Data

35
MaskedFaceRepresentation sachith500 Python

Masked face recognition focuses on identifying people using their facial features while they are wearing masks. We introduce benchmarks on face verifi...

35
raytriangle-test johnnovak Nim

Ray-triangle intersection performance tests in various languages

35
ViHOS phusroyal Jupyter Notebook

Repository for the paper "ViHOS: Vietnamese Hate and Offensive Spans Detection" (EACL2023)

35
hdrhistogram-swift HdrHistogram Swift

Swift port of HdrHistogram

35
ConBench foundation-multimodal-models Python

[NeurIPS'24] Official implementation of paper "Unveiling the Tapestry of Consistency in Large Vision-Language Models".

35
sceneflow_from_blender cv-stuttgart Python

Get 3D motion vectors / scene flow directly from Blender

35
GenoArmory MAGICS-LAB Python

GenoArmory: A Unified Evaluation Framework for Adversarial Attacks on Genomic Foundation Models

35
iohk-monitoring-framework input-output-hk Haskell

This framework provides logging, benchmarking and monitoring.

34
wasm-render alanmacleod TypeScript

Software 3D renderer & rasteriser written in WASM/C & TypeScript to test / showcase WebAssembly and compare performance

34
goku k-nasa Rust

goku is an HTTP load testing application written in Rust

34
benchbox tboox C

🧀 The Benchmark Testing Box

34
SparkDataset Spratiher9 Jupyter Notebook

Instant search for and access to many datasets in PySpark.

34
lua-vs-vimscript henriquehbr

A simple benchmark comparing Lua performance to Vimscript (because no one seems to care about these nowadays)

34
LREBench zjunlp Python

[EMNLP 2022 Findings] Towards Realistic Low-resource Relation Extraction: A Benchmark with Empirical Baseline Study

34
MACSum psunlpgroup Python

Dataset, metrics, and models for TACL 2023 paper MACSUM: Controllable Summarization with Mixed Attributes.

34
MileBench MileBench Python

This repo contains evaluation code for the paper "MileBench: Benchmarking MLLMs in Long Context"

34
TypeEvalPy secure-software-engineering Python

A Micro-benchmarking Framework for Python Type Inference Tools

34
benchmark-privesc-linux ipa-lab Shell

A comprehensive local Linux Privilege-Escalation Benchmark

34
Mess-benchmark bsc-mem Shell

A multiplatform benchmark designed to provide a holistic, detailed, and close-to-hardware view of memory system performance with a family of bandwidth--lat...

34
indivi_collection gaujay C++

A collection of std-like containers written in C++11. Features fast unordered flat map/set, configurable double-ended vector and sparse deque.

34
redis-benchmarks-specification redis Python

The Redis benchmarks specification describes the cross-language/tools requirements and expectations to foster performance and observability standards...

34
DafnyBench sun-wendy Dafny

DafnyBench: A Benchmark for Formal Software Verification

34
tiny_qa_benchmark_pp vincentkoc Python

Tiny QA Benchmark++: a micro-benchmark suite (52-item gold + on-demand multilingual synthetic packs), generator CLI, and CI-ready eval harness for ultr...

34
MSAD boniolp Jupyter Notebook

[VLDB 2023] Model Selection for Anomaly Detection in Time Series

34
zapbench google-research Python

The Zebrafish Activity Prediction Benchmark measures progress on the problem of predicting cellular-resolution neural activity throughout an entire ve...

34
CloudEval-YAML alibaba Python

☁️ Benchmarking LLMs for Cloud Config Generation | Large-model benchmarking for cloud scenarios

34
CIS-Settings krispayne Shell

CIS settings bootstrapper for Mac

33
rex goanywhere Go

Pleasures for Web in Golang

33
saca-bench kurpicz Shell

Collection of Suffix Array Construction Algorithms (SACAs)

33
cmdbench manzik Python

Quick and easy resource usage monitoring and benchmarking for any command's CPU, memory, disk usage and runtime.

33
videocube-toolkit huuuuusy Python

The official Python toolkit for running experiments and evaluating performance on the VideoCube benchmark @TPAMI2023

33
criterion-table nu11ptr Rust

Generate markdown comparison tables from `cargo-criterion` JSON output

33
rvv-bench-results camel-cdr HTML

A collection of RISC-V Vector (RVV) benchmarks to help developers write portably performant RVV code. (Results)

33
imread_benchmark ternaus Python

I/O benchmark for different Python image processing libraries.

33
BeHonest GAIR-NLP JavaScript

BeHonest: Benchmarking Honesty in Large Language Models

33
critdd mirkobunse Python

Critical difference diagrams with Python and TikZ

33
cpp2lua-buindings-battle bagobor C++

Lua <-> C++ bindings libraries benchmark

32
go-test-driven-development gunjan5 Go

🔨 🔧 Test Driven Development 🔁 with Golang 🐹

32
swords p-lambda Python

The Stanford Word Substitution (Swords) Benchmark

32
gwvault GoodwayGroup Go

ansible-vault CLI reimplemented in Go

32
powerqe ryanxingql Python

A unified framework of quality enhancement approaches for compressed images based on PyTorch.

32
NYU-VPR ai4ce Python

[IROS2021] NYU-VPR: Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymization Influences

32
useb UKPLab Python

Heterogeneous, Task- and Domain-Specific Benchmark for Unsupervised Sentence Embeddings used in the TSDAE paper: https://arxiv.org/abs/2104.06979.

32
OpenFed FederalLab Python

A Comprehensive and Versatile Open-Source Federated Learning Framework

32
HPO-B machinelearningnuremberg Python

[NeurIPS DBT 2021] HPO-B

32
divergent lechmazur

LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each other or to 50 i...

32